ref: -2021 tags: FIBSEM electron microscopy presynaptic plasticity activity Funke date: 10-12-2021 17:03 gmt revision:0 [head]

Ultrastructural readout of in vivo synaptic activity for functional connectomics

  • Anna Simon, Arnd Roth, Arlo Sheridan, Mehmet Fişek, Vincenzo Marra, Claudia Racca, Jan Funke, Kevin Staras, Michael Häusser
  • Did FIB-SEM on FM1-43 dye labeled synapses, then segmented the cells using machine learning, as Jan has pioneered.
    • FM1-43FX is membrane impermeable, and labels only synaptic vesicles that have been recycled after dye loading. (Invented in 1992!)
    • FM1-43FX is also able to photoconvert diaminobenzidene (DAB) into an amorphous, highly conjugated polymer with high affinity for osmium tetroxide.
  • This allows for a snapshot of ultrastructural presynaptic plasticity / activity.
  • N=84 boutons, but n=7 pairs / triples of boutons from the same axon.
    • These boutons have the same presynaptic spiking activity, and hence are expected to have the same release probability, and hence the same photoconversion (PC) labeling.
      • But they don't! The ratio of PC+ vesicle numbers between boutons on the same neuron is low, mean < 0.4, which suggests some boutons have high neurotransmitter release and recycling, others have low...
  • Quote in the abstract: We also demonstrate that neighboring boutons of the same axon, which share the same spiking activity, can differ greatly in their presynaptic release probability.
    • Well, sorta, the data here is a bit weak. It might all be lognormal fluctuations, as has been well demonstrated.
    • When I read it I was excited to think of the influence of presynaptic inhibition / modulation, which has not been measured here, but is likely to be important.

ref: -2020 tags: dreamcoder ellis program induction ai tenenbaum date: 10-10-2021 17:32 gmt revision:2 [1] [0] [head]

DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning

  • Kevin Ellis, Catherine Wong, Maxwell Nye, Mathias Sable-Meyer, Luc Cary, Lucas Morales, Luke Hewitt, Armando Solar-Lezama, Joshua B. Tenenbaum

This paper describes a system for adaptively finding programs which succinctly and accurately produce desired output. These desired outputs are provided by the user / test system, and come from a number of domains:

  • list (as in lisp) processing,
  • text editing,
  • regular expressions,
  • line graphics,
  • 2d lego block stacking,
  • symbolic regression (ish),
  • functional programming,
  • and physical laws.
Some of these domains are naturally toy-like, e.g. the text processing, but others are deeply impressive: the system was able to "re-derive" basic physical laws of vector calculus in the process of looking for S-expression forms of cheat-sheet physics equations. These advancements result from a long lineage of work, perhaps starting from the Helmholtz machine PMID-7584891 introduced by Peter Dayan, Geoff Hinton and others, where one model is trained to generate patterns given context, while a second recognition module is trained to invert this model: derive context from the patterns. The two work simultaneously to allow model-exploration in high dimensions.

Also in the lineage is the EC2 algorithm, which most of the same authors above published in 2018. EC2 centers around the idea of "explore - compress": explore solutions to your program induction problem during the 'wake' phase, then compress the observed programs into a library by extracting/factoring out commonalities during the 'sleep' phase. This of course is one of the core algorithms of human learning: explore options, keep track of both what worked and what didn't, search for commonalities among the options & their effects, and use these inferred laws or heuristics to further guide search and goal-setting, thereby building a buffer against the curse of dimensionality. Making the inferred laws themselves functions in a programming library allows hierarchically factoring the search task, making exploration of unbounded spaces possible. This advantage is unique to the program synthesis approach.
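The explore-compress loop can be caricatured in a few lines. This is purely my toy sketch (a two-primitive integer DSL, literal-fragment "compression" instead of version-space refactoring), not DreamCoder's or EC2's actual machinery:

```python
library = ['+1', '*2']          # primitive "programs" in a toy integer DSL

def enumerate_programs(library, depth):
    # wake phase: enumerate compositions of library routines up to some depth
    if depth == 0:
        return [[]]
    shorter = enumerate_programs(library, depth - 1)
    return shorter + [p + [f] for p in shorter for f in library]

def run(program, x):
    # execute a program (a list of primitive names) on input x
    for f in program:
        x = x + 1 if f == '+1' else x * 2
    return x

def solves(program, task):
    return all(run(program, x) == y for x, y in task)

# tasks are input-output pairs; both happen to need "+1 then *2"
tasks = [[(1, 4), (2, 6)], [(0, 2), (3, 8)]]

# wake: search for the first program solving each task
solutions = []
for task in tasks:
    for p in enumerate_programs(library, 3):
        if solves(p, task):
            solutions.append(p)
            break

# abstraction sleep: promote a recurring solution fragment to a library routine
# (real DreamCoder refactors solutions via version spaces before compressing)
common = solutions[0]
if all(s == common for s in solutions):
    library.append(tuple(common))

print(library)
```

The point of the last step is the one the paragraph above makes: once `('+1', '*2')` is a single library item, future wake-phase searches reach solutions that use it in fewer enumeration steps.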

This much is said in the introduction, though perhaps with more clarity. DreamCoder is an improved, more-accessible version of EC2, though the underlying ideas are the same. It differs in that the method for constructing libraries has improved through the addition of a powerful version space for enumerating and evaluating refactors of the solutions generated during the wake phase. (I will admit that I don't much understand the version space system.) This version space allows DreamCoder to collapse the search space for re-factorings by many orders of magnitude, and seems to be a clear advancement. Furthermore, DreamCoder incorporates a second phase of sleep: "dreaming", hence the moniker. During dreaming the library is used to create 'dreams' consisting of combinations of the library primitives, which are then executed with training data as input. These dreams are then used to train up a neural network to predict which library and atomic objects to use in given contexts. Context in this case is where in the parse tree a given object has been inserted (its parent and which argument number it sits in); how the data-context is incorporated to make this decision is not clear to me (???).

This dream and replay-trained neural network is either a GRU recurrent net with 64 hidden states, or a convolutional network feeding into an RNN. The final stage is a linear ReLU (???), and again it is not clear how it feeds into the prediction of "which unit to use when". The authors clearly demonstrate that the network, or the probabilistic context-free grammar that it controls (?), is capable of straightforward optimizations, like breaking symmetries due to commutativity, avoiding adding zero, avoiding multiplying by one, etc. Beyond this, they do demonstrate via an ablation study that the presence of the neural network affords significant algorithmic leverage in all of the problem domains tested. The network also seems to learn a reasonable representation of the sub-type of task encountered -- but a thorough investigation of how it works, or how it might be made to work better, remains desired.

I've spent a little time looking around the code, which is a mix of python high-level experimental control code, and lower-level OCaml code responsible for running (emulating) the lisp-like DSL, inferring type in its polymorphic system / reconciling types in evaluated program instances, maintaining the library, and recompressing it using aforementioned version spaces. The code, like many things experimental, is clearly a work-in-progress, with some old or unused code scattered about, glue to run the many experiments & record / analyze the data, and personal notes from the first author for making his job talks (! :). The description in the supplemental materials, which is satisfyingly thorough (if again impenetrable wrt version spaces), is readily understandable, suggesting that one (presumably the first) author has a clear understanding of the system. It doesn't appear that much is being hidden or glossed over, which is not the case for all scientific papers.

With the caveat that I don't claim to understand the system to completion, there are some clear areas where the existing system could be augmented further. The 'recognition' or perceptual module, which guides actual synthesis of candidate programs, realistically can use as much information as is available in DreamCoder: full lexical and semantic scope, full input-output specifications, type information, possibly runtime binding of variables when filling holes. This is motivated by the way that humans solve problems, at least as observed by introspection:
  • Examine problem, specification; extract patterns (via perceptual modules)
  • Compare patterns with existing library (memory) of compositionally-factored 'useful solutions' (this is identical to the library in DreamCoder)
  • Do something like beam-search or quasi-stochastic search on selected useful solutions. This is the same as DreamCoder; however, human engineers make decisions progressively, at runtime so-to-speak: you fill not one hole per cycle, but many holes. The addition of recursion to DreamCoder, provided a wider breadth of input information, could support this functionality.
  • Run the program to observe input-output .. but also observe the inner workings of the program, eg. dataflow patterns. These dataflow patterns are useful to human engineers when both debugging and when learning-by-inspection what library elements do. DreamCoder does not really have this facility.
  • Compare the current program results to the desired program output. Make a stochastic decision whether to try to fix it, or to try another beam in the search. Since this would be on a computer, this could be in parallel (as DreamCoder is); the ability to 'fix' or change a DUT is simply absent from DreamCoder. As a 'deeply philosophical' aside, this loop itself might be the effect of running a language-of-thought program, as was suggested by pioneers in AI (ref). The loop itself is subject to modification and replacement based on goal-seeking success in the domain of interest, in a deeply-satisfying and deeply recursive manner ...
At each stage in the pipeline, the perceptual modules would have access to relevant variables in the current problem-solving context. This is modeled on Jacques Pitrat's work. Humans of course are even more flexible than that -- context includes roughly the whole brain, and if anything we're mushy on which level of the hierarchy we are working.

Critical to making this work is to have, as I've written in my notes many years ago, a 'self compressing and factorizing memory'. The version space magic + library could be considered a working example of this. In the realm of ANNs, per recent OpenAI results with CLIP and Dall-E, really big transformers also seem to have strong compositional abilities, with the caveat that they need to be trained on segments of the whole web. (This wouldn't be an issue here, as Dreamcoder generates a lot of its own training data via dreams). Despite the data-inefficiency of DNN / transformers, they should be sufficient for making something in the spirit of above work, with a lot of compute, at least until more efficient models are available (which they should be shortly; see AlphaZero vs MuZero).

ref: -2020 tags: excitatory inhibitory balance E-I synapses date: 10-06-2021 17:50 gmt revision:1 [0] [head]

Whole-Neuron Synaptic Mapping Reveals Spatially Precise Excitatory/Inhibitory Balance Limiting Dendritic and Somatic Spiking

We mapped over 90,000 E and I synapses across twelve L2/3 PNs and uncovered structured organization of E and I synapses across dendritic domains as well as within individual dendritic segments. Despite significant domain-specific variation in the absolute density of E and I synapses, their ratio is strikingly balanced locally across dendritic segments. Computational modeling indicates that this spatially precise E/I balance dampens dendritic voltage fluctuations and strongly impacts neuronal firing output.

I'd think this would be tenuous, but they did patch-clamp recordings to back it up, and it's vitally interesting from a structural standpoint. Plus, this is an enjoyable, well-written paper :-)

ref: -2019 tags: HSIC information bottleneck deep learning backprop gaussian kernel date: 10-06-2021 17:23 gmt revision:5 [4] [3] [2] [1] [0] [head]

The HSIC Bottleneck: Deep learning without Back-propagation

In this work, the authors use a kernelized estimate of statistical independence as part of an 'information bottleneck' to set per-layer objective functions for learning useful features in a deep network. They use the HSIC, or Hilbert-Schmidt independence criterion, as the independence measure.

The information bottleneck was proposed by Bialek (spikes..) et al in 1999, and aims to increase the mutual information between the layer representation and the labels while minimizing the mutual information between the representation and the input:

$\min_{P_{T_i | X}} I(X; T_i) - \beta I(T_i; Y)$

Where $T_i$ is the hidden representation at layer i (later, the output), $X$ is the layer input, and $Y$ are the labels. By replacing $I()$ with the HSIC, and some derivation (?), they show that

$HSIC(D) = (m-1)^{-2} \, tr(K_X H K_Y H)$

Where $D = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ are samples and labels, $K_{X_{ij}} = k(x_i, x_j)$ and $K_{Y_{ij}} = k(y_i, y_j)$ -- that is, it's the kernel function applied to all pairs of (vectoral) input variables. $H$ is the centering matrix. The kernel is simply a Gaussian kernel, $k(x, y) = \exp(-\frac{1}{2} ||x - y||^2 / \sigma^2)$. So, if all the x and y are on average independent, then the inner-product will be mean zero, the kernel will be mean one, and after centering will lead to zero trace. If the inner product is large within the realm of the derivative of the kernel, then the HSIC will be large (and negative, I think). In practice they use three different widths for their kernel, and they also center the kernel matrices.
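The estimator above is only a few lines of numpy. This is my reconstruction from the formula, not the authors' code (a single Gaussian kernel width, rather than their three):

```python
import numpy as np

def rbf_kernel(x, sigma=1.0):
    # Gaussian kernel on all pairs of rows of x: K_ij = k(x_i, x_j)
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * x @ x.T
    return np.exp(-0.5 * d2 / sigma**2)

def hsic(x, y, sigma=1.0):
    # HSIC(D) = (m-1)^{-2} tr(K_X H K_Y H), with H the centering matrix
    m = x.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m
    return np.trace(rbf_kernel(x, sigma) @ H @ rbf_kernel(y, sigma) @ H) / (m - 1)**2

rng = np.random.default_rng(0)
x = rng.standard_normal((200, 5))
y_dep = x[:, :1] + 0.1 * rng.standard_normal((200, 1))  # depends on x
y_ind = rng.standard_normal((200, 1))                   # independent of x
print(hsic(x, y_dep), hsic(x, y_ind))
```

Dependent variables give a much larger HSIC than independent ones, which is what lets it stand in for mutual information in the per-layer objective.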

But still, the feedback is an aggregate measure (the trace) of the product of two kernelized (a nonlinearity) outer-product spaces of similarities between inputs. It's not unimaginable that feedback networks could be doing something like this...

For example, a neural network could calculate & communicate aspects of joint statistics to reward / penalize weights within a layer of a network, and this is parallelizable / per layer / adaptable to an unsupervised learning regime. Indeed, that was done almost exactly by this paper: Kernelized information bottleneck leads to biologically plausible 3-factor Hebbian learning in deep networks albeit in a much less intelligible way.

Robust Learning with the Hilbert-Schmidt Independence Criterion

Is another, later, paper using the HSIC. Their interpretation: "This loss-function encourages learning models where the distribution of the residuals between the label and the model prediction is statistically independent of the distribution of the instances themselves." Hence, given the above nomenclature, $E_X(P_{T_i | X} I(X; T_i)) = 0$. (I'm not totally sure about the weighting, but it might be required given the definition of the HSIC.)

As I understand it, the HSIC loss is a kernelized loss between the input, output, and labels that encourages a degree of invariance to input ('covariate shift'). This is useful, but I'm unconvinced that making the layer output independent of the input is absolutely essential (??)

ref: -2020 tags: Principe modular deep learning kernel trick MNIST CIFAR date: 10-06-2021 16:54 gmt revision:2 [1] [0] [head]

Modularizing Deep Learning via Pairwise Learning With Kernels

  • Shiyu Duan, Shujian Yu, Jose Principe
  • The central idea here is to re-interpret deep networks, not with the nonlinearity as the output of a layer, but rather as the input of the layer, with the regression (weights) being performed on this nonlinear projection.
  • In this sense, each re-defined layer is implementing the 'kernel trick': tasks (like classification) which are difficult in linear spaces, become easier when projected into some sort of kernel space.
    • The kernel allows pairwise comparisons of datapoints. EG. a radial basis kernel measures the radial / gaussian distance between data points. A SVM is a kernel machine in this sense.
      • A natural extension (one that the authors have considered) is to take non-pointwise or non-one-to-one kernel functions -- those that e.g. multiply multiple layer outputs. This is of course part of standard kernel machines.
  • Because you are comparing projected datapoints, it's natural to take contrastive loss on each layer to tune the weights to maximize the distance / discrimination between different classes.
    • Hence this is semi-supervised contrastive classification, something that is quite popular these days.
    • The last layer is tuned with cross-entropy labels, but only a few are required since the data is well distributed already.
  • Demonstrated on small-ish datasets, concordant with their computational resources ...
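The 'kernel trick' claim in the bullets above is easy to demonstrate concretely. This is my own toy construction (concentric-ring data, RBF kernel, plain least-squares readout), not anything from the paper: a linear readout on raw 2D coordinates cannot separate two concentric rings, but the same readout on pairwise RBF-kernel features can.

```python
import numpy as np

rng = np.random.default_rng(1)

def ring(n, r):
    # n noisy points on a circle of radius r
    t = rng.uniform(0, 2 * np.pi, n)
    return np.stack([r * np.cos(t), r * np.sin(t)], 1) + 0.1 * rng.standard_normal((n, 2))

X = np.vstack([ring(100, 1.0), ring(100, 3.0)])
y = np.hstack([np.ones(100), -np.ones(100)])

def rbf(A, B, sigma=1.0):
    # pairwise Gaussian similarities between rows of A and rows of B
    d2 = ((A[:, None, :] - B[None, :, :])**2).sum(-1)
    return np.exp(-0.5 * d2 / sigma**2)

def fit_acc(F):
    # least-squares linear readout (with bias) on features F; training accuracy
    Fb = np.c_[F, np.ones(len(F))]
    w, *_ = np.linalg.lstsq(Fb, y, rcond=None)
    return (np.sign(Fb @ w) == y).mean()

acc_linear = fit_acc(X)          # raw coordinates: near chance
acc_kernel = fit_acc(rbf(X, X))  # kernelized pairwise features: near perfect
print(acc_linear, acc_kernel)
```

The kernel features are exactly "pairwise comparisons of datapoints" in the bullet above: each datum is represented by its similarity to every other datum, after which the classes become linearly separable.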

I think in general this is an important result, even if it's not wholly unique / somewhat anticipated (it's a year old at the time of writing). Modular training of neural networks is great for efficiency, parallelization, and biological implementations! Transport of weights between layers is hence non-essential.

Classes still are, but I wonder if temporal continuity can solve some of these problems?

(There is plenty of other effort in this area -- see also {1544})

ref: -2014 tags: CNiFER Kleinfeld dopamine norepinephrine monoamine cell sensor date: 10-04-2021 14:50 gmt revision:2 [1] [0] [head]

Cell-based reporters reveal in vivo dynamics of dopamine and norepinephrine release in murine cortex

  • CNiFERs are clonal cell lines engineered to express a specific GPCR that is coupled to the Gq pathway and triggers an increase in intracellular calcium concentration, [Ca2+], which in turn is rapidly detected by a genetically encoded fluorescence resonance energy transfer (FRET)-based Ca2+ sensor. This system transforms neurotransmitter receptor binding into a change in fluorescence and provides a direct and real-time optical readout of local neurotransmitter activity. Furthermore, by using the natural receptor for a given transmitter, CNiFERs gain the chemical specificity and temporal dynamics present in vivo.
    • Clonal cell line = HEK293.
      • Human cells implanted into mice!
    • Gq pathway = through the phospholipase C-inositol triphosphate (PLC-IP3) pathway.
  • Dopamine sensor required the engineering of a chimeric Gqi5 protein for coupling to PLC. This was a 5-AA substitution (only!)

Referenced -- and used by the recent paper Reinforcement learning links spontaneous cortical dopamine impulses to reward, which showed that dopamine signaling itself can come under volitional, operant-conditioning (or reinforcement type) modulation.

ref: -2011 tags: government polyicy observability submerged state America date: 09-23-2021 22:06 gmt revision:0 [head]

The Submerged State -- How Invisible Government Policies Undermine American Democracy. By Suzanne Mettler

(I've not read this book, just the blurb, but it looks like a defensible thesis): Government policy, rather than distributing resources (money, infrastructure, services) as directly as possible to voters, has recently opted to distribute them indirectly, through private companies. This gives the market & private organizations more perceived clout, perpetuates a level of corruption, and undermines Americans' faith in their government.

So, we need a better 'debugger' for policy in America? Something like a discrete chain rule to help people figure out what policies (and who) are responsible for the good / bad things in their life? Sure seems that the bureaucracy could use some cleanup / is failing under burgeoning complexity. This is probably not dissimilar to cruddy technical systems.

ref: -0 tags: gtk.css scrollbar resize linux qt5 date: 09-16-2021 15:18 gmt revision:2 [1] [0] [head]

Put this in ~/.config/gtk-3.0/gtk.css to make scrollbars larger on high-DPI screens. ref

.scrollbar {
  -GtkScrollbar-has-backward-stepper: 1;
  -GtkScrollbar-has-forward-stepper: 1;
  -GtkRange-slider-width: 16;
  -GtkRange-stepper-size: 16;
}

scrollbar slider {
    /* Size of the slider */
    min-width: 16px;
    min-height: 16px;
    border-radius: 16px;

    /* Padding around the slider */
    border: 2px solid transparent;
}

.scrollbar.vertical slider,
scrollbar.vertical slider {
    min-height: 16px;
    min-width: 16px;
}

scrollbar.horizontal slider {
    min-width: 16px;
    min-height: 16px;
}

/* Scrollbar trough squeezes when cursor hovers over it. Disable that: */

.scrollbar.vertical.dragging:dir(ltr) {
    margin-left: 0px;
}

.scrollbar.vertical.dragging:dir(rtl) {
    margin-right: 0px;
}

.scrollbar.horizontal.slider.dragging {
    margin-top: 0px;
}

undershoot.top, undershoot.right, undershoot.bottom, undershoot.left { background-image: none; }

To make the scrollbars a bit easier to see in QT5 applications, run qt5ct (after apt-getting it), and add in a new style sheet, /usr/share/qt5ct/qss/scrollbar-simple-backup.qss

/* SCROLLBARS (NOTE: Changing 1 subcontrol means you have to change all of them)*/
QScrollBar{
  background: palette(alternate-base);
  margin: 0px 0px 0px 0px;
}
QScrollBar::handle{
  background: #816891;
  border: 1px solid transparent;
  border-radius: 1px;
}
QScrollBar::handle:hover, QScrollBar::add-line:hover, QScrollBar::sub-line:hover{
  background: palette(highlight);
}
QScrollBar::add-line:vertical, QScrollBar::sub-line:vertical{
  height: 0px;
  subcontrol-origin: none;
}
QScrollBar::add-line:horizontal, QScrollBar::sub-line:horizontal{
  width: 0px;
  subcontrol-origin: none;
}

ref: -2021 tags: gated multi layer perceptrons transformers ML Quoc_Le Google_Brain date: 08-05-2021 06:00 gmt revision:4 [3] [2] [1] [0] [head]

Pay attention to MLPs

  • Using bilinear / multiplicative gating + deep / wide networks, you can attain accuracies similar to Transformers on vision and masked language learning tasks! No attention needed, just an in-network multiplicative term.
  • And the math is quite straightforward. Per layer:
    • $Z = \sigma(X U), \quad \hat{Z} = s(Z), \quad Y = \hat{Z} V$
      • Where X is the layer input, $\sigma$ is the nonlinearity (GeLU), U is a weight matrix, $\hat{Z}$ is the spatially-gated Z, and V is another weight matrix.
    • $s(Z) = Z_1 \odot (W Z_2 + b)$
      • Where Z is divided into two parts along the channel dimension, $Z_1, Z_2$; $\odot$ is element-wise multiplication, and W is a weight matrix.
  • You of course need a lot of compute; this paper has nice figures of model accuracy scaling vs. depth / number of parameters / size. I guess you can do this if you're Google.
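The per-layer math above is simple enough to sketch at the shape level in numpy. Dimensions and initial values here are my choices for illustration (the paper initializes b to 1 and W near zero so the gate starts close to identity):

```python
import numpy as np

rng = np.random.default_rng(0)

def gelu(x):
    # tanh approximation of GeLU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

n, d_in, d_ff = 8, 16, 32                # sequence length, model width, expanded width
X = rng.standard_normal((n, d_in))
U = rng.standard_normal((d_in, d_ff)) * 0.1
V = rng.standard_normal((d_ff // 2, d_in)) * 0.1
W = rng.standard_normal((n, n)) * 0.1    # spatial (token-mixing) weights
b = np.ones(n)                           # b init to 1: gate starts near identity

Z = gelu(X @ U)                          # Z = sigma(X U)
Z1, Z2 = np.split(Z, 2, axis=1)          # split along the channel dimension
Zhat = Z1 * (W @ Z2 + b[:, None])        # s(Z) = Z1 ⊙ (W Z2 + b): spatial gating
Y = Zhat @ V                             # Y = Ẑ V
print(Y.shape)
```

Note the only cross-token interaction is the single matrix W acting over the sequence dimension; that multiplicative term is what replaces attention.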

Pretty remarkable that an industrial lab freely publishes results like this. I guess the ROI is that they get the resultant improved ideas? Or, perhaps, Google is in such a dominant position in terms of data and compute that even if they give away ideas and code, provided some of the resultant innovation returns to them, they win. The return includes trained people as well as ideas. Good for us, I guess!

ref: -2018 tags: luke metz meta learning google brain sgd model mnist Hebbian date: 08-05-2021 01:07 gmt revision:2 [1] [0] [head]

Meta-Learning Update Rules for Unsupervised Representation Learning

  • Central idea: meta-train a training-network (a MLP) which trains a task-network (also a MLP) to do unsupervised learning on one dataset.
  • The training network is optimized through SGD based on small-shot linear learning on a test set, typically different from the unsupervised training set.
  • The training-network is a per-weight MLP which takes in layer input, layer output, and a synthetic error (denoted $\eta$), and generates a and b, which are then fed into an outer-product Hebbian learning rule.
  • $\eta$ itself is formed through a backward pass through weights $V$, which affords something like backprop -- but not exactly backprop, of course. See the figure.
  • Training consists of building up very large, backward through time gradient estimates relative to the parameters of the training-network. (And there are a lot!)
  • Trained on CIFAR10, MNIST, FashionMNIST, IMDB sentiment prediction. All have their input permuted to keep the training-network from learning per-task weights. Instead the network should learn to interpret the statistics between datapoints.
  • Indeed, it does this -- albeit with limits. Performance is OK, but only if you only do supervised learning on the very limited dataset used in the meta-optimization.
    • In practice, it's possible to completely solve tasks like MNIST with supervised learning; this gets to about 80% accuracy.
  • Images were kept small -- about 20x20 -- to speed up the inner loop unsupervised learning. Still, this took on the order of 200 hours across ~500 TPUs.
  • See, as a comparison, Keren's paper, Meta-learning biologically plausible semi-supervised update rules. It's conceptually nice but only evaluates the two-moons and two-gaussian datasets.
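The shape of the update machinery can be sketched as follows. Everything here is a stand-in of my own: the real a and b come from the meta-trained per-weight MLP, and the real $\eta$ computation follows the paper's figure, not this schematic backward pass:

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 10, 4                          # hypothetical layer sizes
W = rng.standard_normal((n_out, n_in)) * 0.1 # forward weights of the task-network
V = rng.standard_normal((n_in, n_out)) * 0.1 # backward weights carrying eta

def update_net(x, h, eta):
    # Stand-in for the meta-learned MLP: maps (layer input, layer output,
    # synthetic error) to pre- and post-synaptic factors b and a.
    a = np.tanh(h + eta)   # post-synaptic factor, shape (n_out,)
    b = np.tanh(x)         # pre-synaptic factor, shape (n_in,)
    return a, b

x = rng.standard_normal(n_in)
h = np.tanh(W @ x)                  # forward pass through the task-network layer
eta = V.T @ (V @ h)                 # schematic: error synthesized via backward weights

a, b = update_net(x, h, eta)
W = W + 0.01 * np.outer(a, b)       # outer-product Hebbian update: ΔW ∝ a bᵀ
print(W.shape)
```

The key structural point survives the simplification: the update is local (per-weight, outer-product of a pre- and post-synaptic factor), with only $\eta$ carrying information backward.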

This is a clearly-written, easy to understand paper. The results are not highly compelling, but as a first set of experiments, it's successful enough.

I wonder what more constraints (fewer parameters, per the genome), more options for architecture modifications (e.g. different feedback schemes, per neurobiology), and a black-box optimization algorithm (evolution) would do?

ref: -0 tags: sparse coding reference list olshausen field date: 08-04-2021 01:07 gmt revision:5 [4] [3] [2] [1] [0] [head]

This was compiled from searching papers which referenced Olshausen and Field 1996 PMID-8637596 Emergence of simple-cell receptive field properties by learning a sparse code for natural images.

ref: -1992 tags: Linsker infomax Hebbian anti-hebbian linear perceptron unsupervised learning date: 08-04-2021 00:20 gmt revision:2 [1] [0] [head]

Local synaptic learning rules suffice to maximize mutual information in a linear network

  • Ralph Linsker, 1992.
  • A development upon {1545} -- this time with lateral inhibition trained through noise-contrast and anti-Hebbian plasticity.
  • {1545} does not perfectly maximize the mutual information between the input and output -- this allegedly requires the inverse of the covariance matrix, $Q$.
    • As before, infomax principles; maximize mutual information $MI \propto H(Z) - H(Z|S)$ where Z is the network output and S is the signal input. (Note: minimize the conditional entropy of output given the input.)
    • For a Gaussian variable, $H = \frac{1}{2} \ln \det Q$ where Q is the covariance matrix. In this case $Q = E[Z Z^T]$.
    • Since $Z = C(S, N)$ where C are the weights, S is the signal, and N is the noise, $Q = C q C^T + r$ where q is the covariance matrix of the input noise and r is the cov.mtx. of the output noise.
    • (somewhat confusing): $\delta H / \delta C = Q^{-1} C q$
      • because .. the derivative of the determinant is complicated.
      • Check the appendix for the derivation. $\ln \det Q = Tr \ln Q$ and $dH = \frac{1}{2} d(Tr \ln Q) = \frac{1}{2} Tr(Q^{-1} dQ)$ -- this holds for positive semidefinite matrices like Q.

  • From this he comes up with a set of rules whereby feedforward weights are trained in a Hebbian fashion, but based on activity after lateral activation.
  • The lateral activation has a weight matrix $F = I - \alpha Q$ (again Q is the cov.mtx. of Z). If $y(0) = Y; \; y(t+1) = Y + F y(t)$, where Y is the feed-forward activation, then $\alpha y(\infty) = Q^{-1} Y$. This checks out:
x = randn(1000, 10);
Q = x' * x;
a = 0.001;
Y = randn(10, 1);
y = zeros(10, 1);
for i = 1:1000
	y = Y + (eye(10) - a*Q)*y;
end
y - pinv(Q)*Y / a % should be zero.
  • This recursive definition is Jacobi iteration. $\alpha y(\infty) = \alpha \sum_{t=0}^{\infty} F^t Y = \alpha (I - F)^{-1} Y = Q^{-1} Y$.
  • Still, you need to estimate Q through a running average, $\Delta Q_{nm} = \frac{1}{M}(Y_n Y_m + r_{nm} - Q_{nm})$, and since $F = I - \alpha Q$, F is formed via anti-Hebbian terms.

To this is added a 'sensing' learning and 'noise' unlearning phase -- one maximizes $H(Z)$, the other minimizes $H(Z|S)$. Everything is then applied, similar to before, to gaussian-filtered one-dimensional white-noise stimuli. He shows this results in bandpass filter behavior -- quite weak sauce in an era where ML papers are expected to test on five or so datasets. Even if this was 1992 (nearly thirty years ago!), it would have been nice to see this applied to a more realistic dataset; perhaps some of the following papers? Olshausen & Field came out in 1996 -- but they applied their algorithm to real images.

In both Olshausen & this work, no affordances are made for multiple layers. There have to be solutions out there...

ref: -1988 tags: Linsker infomax linear neural network hebbian learning unsupervised date: 08-03-2021 06:12 gmt revision:2 [1] [0] [head]

Self-organization in a perceptual network

  • Ralph Linsker, 1988.
  • One of the first (verbose, slightly diffuse) investigations of the properties of linear projection neurons (e.g. dot-product; no non-linearity) to express useful tuning functions.
  • 'Useful' here means information-preserving, in the face of noise or dimensional bottlenecks (like PCA).
  • Starts with Hebbian learning rules, and shows that with these + white-noise sensory input + some local topology, you can get simple and complex visual cell responses.
    • Ralph notes that neurons in primate visual cortex are tuned in utero -- prior real-world visual experience! Wow. (Who did these studies?)
    • This is a very minimalistic starting point; there isn't even structured stimuli (!)
    • Single neuron (and later, multiple neurons) are purely feed-forward; author cautions that a lack of feedback is not biologically realistic.
      • Also note that this was back in the Motorola 680x0 days ... computers were not that powerful (but certainly could handle more than 1-2 neurons!)
  • Linear algebra shows that Hebbian synapses cause a linear layer to learn the covariance function of its inputs, $Q$, with no dependence on the actual layer activity.
  • When looked at in terms of an energy function, this is equivalent to gradient descent to maximize the layer-output variance.
  • He also hits on:
    • Hopfield networks,
    • PCA,
    • Oja's constrained Hebbian rule $\delta w_i \propto \langle L_2(L_1 - L_2 w_i) \rangle$ (that is, a quadratic constraint on the weights to keep $\Sigma w^2 \sim 1$ )
    • Optimal linear reconstruction in the presence of noise
    • Mutual information between layer input and output (I found this to be a bit hand-wavey)
      • Yet he notes critically: "but it is not true that maximum information rate and maximum activity variance coincide when the probability distribution of signals is arbitrary".
        • Indeed. The world is characterized by very non-Gaussian structured sensory stimuli.
    • Redundancy and diversity in 2-neuron coding model.
    • Role of infomax in maximizing the determinant of the weight matrix, sorta.
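Oja's rule is easy to verify numerically -- a sketch of my own in numpy; the weight vector converges to the top principal component of the input, with unit norm:

```python
import numpy as np

rng = np.random.default_rng(2)
u = np.array([0.6, 0.8])                 # dominant input direction, |u| = 1
X = rng.standard_normal((5000, 2)) \
    + 3.0 * np.outer(rng.standard_normal(5000), u)

w = 0.1 * rng.standard_normal(2)
eta = 0.001
for L1 in X:                        # L1: input activations
    L2 = w @ L1                     # L2: linear output
    w += eta * L2 * (L1 - L2 * w)   # Oja: delta w ~ <L2 (L1 - L2 w)>
```

After one pass over the data, w is approximately unit length and aligned with u, the direction of greatest variance.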

One may critically challenge the infomax idea: we very much need to (and do) throw away spurious or irrelevant information in our sensory streams; what upper layers 'care about' when making decisions is certainly relevant to the lower layers. This credit assignment is neatly solved by backprop, and there are a number of 'biologically plausible' means of performing it, but both this and infomax are maybe avoiding the problem. What might the upper layers really care about? Likely 'care about' is an emergent property of the interacting local learning rules and network structure. Can you search directly in these domains, within biological limits, and motivated by statistical reality, to find unsupervised-learning networks?

You'll still need a way to rank the networks, hence an objective 'care about' function. Sigh. Either way, I don't per se put a lot of weight in the infomax principle. It could be useful, but is only part of the story. Otherwise Linsker's discussion is accessible, lucid, and prescient.


ref: -2019 tags: backprop neural networks deep learning coordinate descent alternating minimization date: 07-21-2021 03:07 gmt revision:1 [0] [head]

Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

  • This paper is sort-of interesting: rather than back-propagating the errors, you optimize auxiliary variables, pre-nonlinearity 'codes', in a last-to-first layer order. The optimization minimizes a multinomial logistic loss function; the math is not worked out for other loss functions, but presumably this is not a fundamental limitation. The loss function also includes a quadratic term on the weights.
  • After the 'codes' are set, optimization can proceed in parallel on the weights. This is done with either straight SGD or adaptive ADAM.
  • Weight L2 penalty is scheduled over time.
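Not the paper's algorithm, but the flavor of alternating minimization is easy to show on a toy problem -- alternating least squares for low-rank matrix factorization, where each sub-problem is convex given the other block (my own sketch, in numpy):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))  # rank-3 target

U = rng.standard_normal((20, 3))
V = rng.standard_normal((15, 3))
for _ in range(20):
    # each sub-problem is plain least squares, convex given the other block
    U = np.linalg.lstsq(V, M.T, rcond=None)[0].T   # fix V, solve for U
    V = np.linalg.lstsq(U, M, rcond=None)[0].T     # fix U, solve for V

residual = np.linalg.norm(M - U @ V.T) / np.linalg.norm(M)
```

For an exactly low-rank target the alternation converges to a near-exact factorization within a few sweeps.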

This is interesting in that the weight updates can be done in parallel -- perhaps more efficient -- but you are still propagating errors backward, albeit via optimizing 'codes'. Given the vast infrastructure devoted to auto-diff + backprop, I can't see this being adopted broadly.

That said, the idea of alternating minimization (which is used eg for EM clustering) is powerful, and this paper does describe (though I didn't read it) how there are guarantees on the convexity of the alternating minimization. Likewise, the authors show how to improve the performance of the online / minibatch algorithm by keeping around memory variables, in the form of covariance matrices.

ref: -0 tags: gpu burn stress test github cuda date: 07-13-2021 21:32 gmt revision:0 [head]

https://github.com/wilicc/gpu-burn Multi-GPU stress test.

Are your GPUs overclocked to the point of overheating / being unreliable?

ref: -0 tags: machine learning blog date: 04-22-2021 15:43 gmt revision:0 [head]

Paper notes by Vitaly Kurin

Like this blog but 100% better!

ref: -2020 tags: feedback alignment local hebbian learning rules stanford date: 04-22-2021 03:26 gmt revision:0 [head]

Two Routes to Scalable Credit Assignment without Weight Symmetry

This paper looks at five different learning rules, three purely local, and two non-local, to see if they can work as well as backprop in training a deep convolutional net on ImageNet. The local learning networks all feature forward weights W and backward weights B; the forward weights (+ nonlinearities) pass the information to lead to a classification; the backward weights pass the error, which is used to locally adjust the forward weights.

Hence, each fake neuron locally has the forward activation, the backward error (or loss gradient), the forward weight, the backward weight, and Hebbian terms thereof (e.g. the outer product of the in-out vectors for both the forward and backward passes). From these available variables, they construct the local learning rules:

  • Decay (exponentially decay the backward weights)
  • Amp (Hebbian learning)
  • Null (decay based on the product of the weight and local activation; this effects a Euclidean norm on reconstruction)

Each of these serves as a "regularizer term" on the feedback weights, which governs their learning dynamics. In the case of backprop, the backward weights B are just the instantaneous transpose of the forward weights W. A good local learning rule approximates this transpose progressively. They show that, with proper hyperparameter setting, this does indeed work nearly as well as backprop when training a ResNet-18 network.
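The core trick -- backward weights B that are not the transpose of W can still carry a useful error signal -- can be sketched on a toy linear network (my own construction in numpy, not the paper's ResNet setup):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((200, 10))
T = X @ rng.standard_normal((10, 2))       # targets from a linear teacher

W1 = 0.3 * rng.standard_normal((10, 20))   # forward weights, layer 1
W2 = 0.3 * rng.standard_normal((20, 2))    # forward weights, layer 2
B = rng.standard_normal((2, 20))           # fixed random feedback weights

def loss():
    return np.mean((X @ W1 @ W2 - T) ** 2)

eta, loss0 = 0.005, loss()
for _ in range(2000):
    H = X @ W1
    E = H @ W2 - T                         # output error
    dH = E @ B                             # error routed through B, not W2.T
    W2 -= eta * H.T @ E / len(X)
    W1 -= eta * X.T @ dH / len(X)
```

Despite the random, fixed feedback path, the loss drops substantially -- the feedback-alignment effect.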

But, hyperparameter settings don't translate to other network topologies. To allow this, they add in non-local learning rules:

  • Sparse (penalizes the Euclidean norm of the previous layer; the gradient is the outer product of the current layer activation with its transpose, times B)
  • Self (directly measures the forward weights and uses them to update the backward weights)

In "Symmetric Alignment", the Self and Decay rules are employed. This is similar to backprop (the backward weights will track the forward ones) with L2 regularization, which is not new. It performs very similarly to backprop. In "Activation Alignment", Amp and Sparse rules are employed. I assume this is supposed to be more biologically plausible -- the Hebbian term can track the forward weights, while the Sparse rule regularizes and stabilizes the learning, such that overall dynamics allow the gradient to flow even if W and B aren't transposes of each other.

Surprisingly, they find Symmetric Alignment to be more robust to the injection of Gaussian noise during training than backprop. Both SA and AA achieve similar accuracies on the ResNet benchmark. The authors then go on to explain the plausibility of non-local but approximate learning rules with Regression discontinuity design ala Spiking allows neurons to estimate their causal effect.

This is a decent paper, reasonably well written. They thought through what variables are available to affect learning, and parameterized five combinations that work. Could they have done the full matrix of combinations, optimizing them just the same as the metaparameters? Perhaps, but that would be even more work ...

Regarding the desire to reconcile backprop and biology, this paper does not bring us much (if at all) closer. Biological neural networks have specific and local uses for error; even invoking 'error' has limited explanatory power on activity. Learning and firing dynamics, of course of course. Is the brain then just an overbearing mess of details and overlapping rules? Yes, probably, but that doesn't mean that we humans can't find something simpler that works. The algorithms in this paper, for example, are well described by a bit of linear algebra, and yet they are performant.

ref: -0 tags: saab EPC date: 03-22-2021 01:29 gmt revision:0 [head]

https://webautocats.com/epc/saab/sbd/ -- Online, free parts look-up for Saab cars. Useful.

ref: -2010 tags: neural signaling rate code patch clamp barrel cortex date: 03-18-2021 18:41 gmt revision:0 [head]

PMID-20596024 Sensitivity to perturbations in vivo implies high noise and suggests rate coding in cortex

  • How did I not know of this paper before.
  • Solid study showing that, while a single spike can elicit 28 spikes in post-synaptic neurons, the associated level of noise is indistinguishable from intrinsic noise.
  • Hence the cortex should communicate / compute in rate codes or large synchronized burst firing.
    • They found large bursts to be infrequent, timing precision to be low, hence rate codes.
    • Of course other examples, e.g. auditory cortex, exist.

Cortical reliability amid noise and chaos

  • Noise is primarily of synaptic origin. (Dropout)
  • Recurrent cortical connectivity supports sensitivity to precise timing of thalamocortical inputs.

ref: -0 tags: cortical computation learning predictive coding reviews date: 02-23-2021 20:15 gmt revision:2 [1] [0] [head]

PMID-30359606 Predictive Processing: A Canonical Cortical Computation

  • Georg B Keller, Thomas D Mrsic-Flogel
  • Their model includes two error signals, positive and negative, for reconciling the sensory experience with the top-down predictions. I haven't read the full article, and disagree that such errors are explicitly represented by distinct neuron types, but the model is plausible. Hence worth recording the paper here.

PMID-23177956 Canonical microcircuits for predictive coding

  • Andre M Bastos, W Martin Usrey, Rick A Adams, George R Mangun, Pascal Fries, Karl J Friston
  • We revisit the established idea that message passing among hierarchical cortical areas implements a form of Bayesian inference -- paying careful attention to the implications for intrinsic connections among neuronal populations.
  • Have these algorithms been put to practical use? I don't know...

Control of synaptic plasticity in deep cortical networks

  • Pieter R. Roelfsema & Anthony Holtmaat
  • Basically argue for a many-factor learning rule at the feedforward and feedback synapses, taking into account pre, post, attention, and reinforcement signals.
  • See comment by Tim Lillicrap and Blake Richards.

ref: -0 tags: protein engineering structure evolution date: 02-23-2021 19:57 gmt revision:1 [0] [head]

From Protein Structure to Function with Bioinformatics

  • Dense and useful resource!
  • Few new folds have been discovered since 2010 -- the total number of extant protein folds is around 100,000. Evolution re-uses existing folds + the protein fold space is highly convergent. Amazing. link

ref: -2013 tags: larkum calcium spikes dendrites association cortex binding date: 02-23-2021 19:52 gmt revision:3 [2] [1] [0] [head]

PMID-23273272 A cellular mechanism for cortical associations: an organizing principle for the cerebral cortex

  • Distal tuft dendrites have a second spike-initiation zone, where depolarization can induce a calcium plateau of up to 50ms long.  This depolarization can cause multiple spikes in the soma, and can be more effective at inducing spikes than depolarization through the basal dendrites.  Such spikes are frequently bursts of 2-4 at 200hz. 
  • Bursts of spikes can also be triggered by backpropagation-activated calcium (BAC) spikes, which can halve the current threshold for a dendritic spike. That is, there is enough signal propagation for information to propagate both down the dendritic arbor and up, and the two interact non-linearly.
  • This nonlinear calcium-dependent association pairing can be blocked by inhibition to the dendrites (presumably apical?). 
    • Larkum argues that the different timelines of GABA inhibition offer 'exquisite control' of the dendrites; but these sorts of arguments as to computational power always seem lame compared to stating what their actual role might be. 
  • Quote: "Dendritic calcium spikes have been recorded in vivo [57, 84, 85] that correlate to behavior [78, 86]."  The recordings are population-level, though, and do not seem to measure individual dendrites (?).

See also:

PMID-25174710 Sensory-evoked LTP driven by dendritic plateau potentials in vivo

  • We demonstrate that rhythmic sensory whisker stimulation efficiently induces synaptic LTP in layer 2/3 (L2/3) pyramidal cells in the absence of somatic spikes.
  • It instead depends on NMDA-dependent dendritic spikes.
  • And this is dependent on afferents from the POm thalamus.

And: The binding solution?, a blog post covering Bittner 2015 that looks at rapid dendritic plasticity in the hippocampus as a means of binding stimuli to place fields.

ref: -0 tags: tennenbaum compositional learning character recognition one-shot learning date: 02-23-2021 18:56 gmt revision:2 [1] [0] [head]

One-shot learning by inverting a compositional causal process

  • Brenden Lake, Russ Salakhutdinov, Josh Tenenbaum
  • This is the paper that preceded the 2015 Science publication "Human-level concept learning through probabilistic program induction"
  • Because it's a NIPS paper, and not a science paper, this one is a bit more accessible: the logic to the details and developments is apparent.
  • General idea: build up a fully probabilistic model of multi-language (omniglot corpus) characters / tokens. This model includes things like character type / alphabet, number of strokes, curvature of strokes (parameterized via bezier splines), where strokes attach to others (spatial relations), stroke scale, and character scale. The model (won't repeat the formal definition) is factorized to be both compositional and causal, though all the details of the conditional probs are left to the supplemental material.
  • They fit the complete model to the Omniglot data using gradient descent + image-space noising, e.g. tweak the free parameters of the model to generate images that look like the human-created characters. (This too is in the supplement).
  • Because the model is high-dimensional and hard to invert, they generate a perceptual model by winnowing down the image into a skeleton, then breaking this into a variable number of strokes.
    • The probabilistic model then assigns a log-likelihood to each of the parses.
    • They then use the model with Metropolis-Hastings MCMC to sample a region in parameter space around each parse -- and they additionally sample $\psi$ (the character type) to get a greater weighted diversity of types.
      • Surprisingly, they don't estimate the image likelihood -- which is expensive -- they just re-do the parsing based on aggregate info embedded in the statistical model. Clever.
  • $\psi$ is the character type (a, b, c, ...); $\psi = \{\kappa, S, R\}$, where $\kappa$ is the number of strokes, S is a set of parameterized strokes, and R are the relations between strokes.
  • $\theta$ are the per-token stroke parameters.
  • $I$ is the image, obvi.
  • Classification task: one image of a new character (c) vs. 20 new characters from the same alphabet (test, (t)). Among the 20 there is one character of the same type -- the task is to find it.
  • With 'hierarchical bayesian program learning', they not only anneal the type to the parameters (with MCMC, above) for the test image, but they also fit the parameters using gradient descent to the image.
    • Subsequently parses the test image onto the class image (c)
      • Hence the best classification is the one where both are in the best agreement: $\underset{c}{\text{argmax}} \frac{P(c|t)}{P(c)} P(t|c)$, where $P(c)$ is approximated as the parse weights.
      • Again, this is clever as it allows significant information leakage between (c) and (t) ...
      • The other models (Affine, Deep Boltzman Machines, Hierarchical Deep Model) have nothing like this -- they are feed-forward.
  • No wonder HBPL performs better. It's a better model of the data, that has a bidirectional fitting routine.
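The Metropolis-Hastings step above is generic; a minimal sampler sketch of my own, in numpy, on a toy 2-D Gaussian rather than their parse space:

```python
import numpy as np

rng = np.random.default_rng(5)

def log_p(x):                    # unnormalized log-density: standard 2-D normal
    return -0.5 * x @ x

x = np.zeros(2)
samples = []
for _ in range(20000):
    proposal = x + 0.5 * rng.standard_normal(2)        # symmetric random walk
    if np.log(rng.random()) < log_p(proposal) - log_p(x):
        x = proposal                                    # accept
    samples.append(x)                                   # else keep current x

samples = np.array(samples[2000:])                      # drop burn-in
```

The retained samples match the target's mean and variance; in HBPL the same accept/reject loop runs over parse parameters with the model's log-likelihood in place of log_p.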

  • As i read the paper, had a few vague 'hedons':
    • Model building is essential. But unidirectional models are insufficient; if the models include the mechanism for their own inversion, many fitting and inference problems are solved. (Such is my intuition)
      • As a corollary of this, having both forward and backward tags (links) can be used to neatly solve the binding problem. This should be easy in a computer w/ pointers, though in the brain I'm not sure how it might work (?!) without some sort of combinatorial explosion?
    • The fitting process has to be multi-pass or at least re-entrant. Both this paper and the Vicarious CAPTCHA paper feature statistical message passing to infer or estimate hidden explanatory variables. Seems correct.
    • The model here includes relations that are conditional on stroke parameters that occurred / were parsed beforehand; this is very appealing in that the model/generator/AI needs to be flexibly re-entrant to support hierarchical planning ...

ref: -0 tags: neuronal assemblies maass hebbian plasticity simulation austria fMRI date: 02-23-2021 18:49 gmt revision:1 [0] [head]

PMID-32381648 A model for structured information representation in neural networks in the brain

  • Using randomly connected E/I networks, suggests that information can be "bound" together using fast Hebbian STDP.
  • That is, 'assemblies' in higher-level areas reference sensory information through patterns of bidirectional connectivity.
  • These patterns emerge spontaneously following disinhibition of the higher-level areas.
  • I find the results underwhelming, but the discussion is more interesting.
    • E.g. there has been a lot of theoretical and computational-experimental work on how concepts are bound together into symbols or grammars.
    • The referenced fMRI studies are interesting, too: they imply that you can observe the results of structural binding in activity of the superior temporal gyrus.
  • I'm more in favor of dendritic potentials or neuronal up/down states as a fast and flexible way of maintaining 'symbol membership' --
    • But it's not as flexible as synaptic plasticity, which, obviously, populates the outer product between 'region a' and 'region b' with a memory substrate, thereby spanning the range of plausible symbol-bindings.
    • Inhibitory interneurons can then gate the bindings, per morphological evidence.
    • But but, I don't think anyone has shown that you need protein synthesis for perception, as you do for LTP (modulo AMPAR cycling).
      • Hence I'd argue that localized dendritic potentials can serve as the flexible outer-product 'memory tag' for presence in an assembly.
        • Or maybe they are used primarily for learning, who knows!

ref: -2019 tags: deep double descent lottery ticket date: 02-23-2021 18:47 gmt revision:2 [1] [0] [head]

Reconciling modern machine-learning practice and the classical bias–variance trade-off

A formal publication of the effect famously discovered at OpenAI & publicized on their blog. Goes into some detail on Fourier features & runs experiments to verify the OpenAI findings. The result stands.

An interesting avenue of research is using genetic algorithms to perform the search over neural network parameters (instead of backprop) in reinforcement-learning tasks. Ben Phillips has a blog post on some of the recent results, which show that it does work for certain 'hard' problems in RL. Of course, this is the dual of the 'lottery ticket' hypothesis and the deep double descent, above; large networks are likely to have solutions 'close enough' to solve a given problem.

That said, genetic algorithms don't necessarily perform gradient descent to tweak the weights for optimal behavior once they are within the right region of RL behavior. See {1530} for more discussion on this topic, as well as {1525} for a more complete literature survey.

ref: -2020 tags: current opinion in neurobiology Kriegeskorte review article deep learning neural nets circles date: 02-23-2021 17:40 gmt revision:2 [1] [0] [head]

Going in circles is the way forward: the role of recurrence in visual inference

I think the best part of this article are the references -- a nicely complete listing of, well, the current opinion in Neurobiology! (Note that this issue is edited by our own Karel Svoboda, hence there are a good number of Janelians in the author list..)

The gestalt of the review is that deep neural networks need to be recurrent, not purely feed-forward. This results in savings in overall network size, and an increase in the achievable computational complexity, perhaps via the incorporation of priors and temporal-spatial information. All this again makes perfect sense and matches my sense of prevailing opinion. Of course, we are left wanting more: all this recurrence ought to be structured in some way.

To me, a rather naive way of thinking about it is that feed-forward layers cause weak activations, which are 'amplified' or 'selected for' in downstream neurons. These neurons proximally code for 'causes' or local reasons, based on the supported hypothesis that the brain has a good temporal-spatial model of the visuo-motor world. The causes then can either explain away the visual input, leading to balanced E-I, or fail to explain it, in which case the excess activity is either rectified by engaging more circuits or by engaging synaptic plasticity.

A critical part of this hypothesis is some degree of binding / disentanglement / spatio-temporal re-assignment. While not all models of computation require registers / variables -- RNNs are Turing-complete, e.g. -- I remain stuck on the idea that, to explain phenomenological experience and practical cognition, the brain must have some means of 'binding'. A reasonable place to look is the apical tuft dendrites, which are capable of storing temporary state (calcium spikes, NMDA spikes), undergo rapid synaptic plasticity, and are so dense that they can reasonably store the outer-product space of binding.

There is mounting evidence for apical tufts working independently / in parallel from investigations of high-gamma in ECoG: PMID-32851172 Dissociation of broadband high-frequency activity and neuronal firing in the neocortex. "High gamma" shows little correlation with MUA when you differentiate early-deep and late-superficial responses, "consistent with the view it reflects dendritic processing separable from local neuronal firing".

ref: -2009 tags: Baldwin effect finches date: 02-22-2021 17:35 gmt revision:0 [head]

Evolutionary significance of phenotypic accommodation in novel environments: an empirical test of the Baldwin effect

Up until reading this, I had thought that the Baldwin effect refers to the fact that when animals gain an ability to learn, this allows them to take new ecological roles without genotypic adaptation. This is a component of the effect, but is not the original meaning, which is the opposite: when species adapt to a novel environment through phenotypic adaptation (say, adapting to colder weather through within-lifetime variation), evolution tends to push these changes into the germ line. This is something to the effect of Lamarckian evolution.

In the case of house finches, as discussed in the link above, this pertains to increased brood variability and sexual dimorphism due to varied maternal habits and hormones under environmental stress. This variance is then rapidly operated on by natural selection to tune the finch to its new environment, including Montana, where the single author did most of his investigation.

There are of course countless other details here, but still this is an illuminating demonstration of how evolution works to move information into the genome.

ref: -2013 tags: synaptic learning rules calcium harris stdp date: 02-18-2021 19:48 gmt revision:3 [2] [1] [0] [head]

PMID-24204224 The Convallis rule for unsupervised learning in cortical networks (2013) - Pierre Yger, Kenneth D Harris

This paper aims to unify and reconcile experimental evidence of in-vivo learning rules with established STDP rules.  In particular, the STDP rule fails to accurately predict the change in strength in response to spike triplets, e.g. pre-post-pre or post-pre-post.  Their model instead involves the competition between two threshold circuits / coincidence detectors with different time constants, one controlling LTD and the other LTP, and is thus an extension of the classical BCM rule.  (BCM: inputs below a threshold weaken a synapse; those above it strengthen it.)
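The classical BCM rule referenced here is a one-liner; a toy sketch of my own in numpy, showing how a sliding threshold produces selectivity between two input patterns:

```python
import numpy as np

rng = np.random.default_rng(6)
patterns = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # two input patterns
w = np.array([0.6, 0.4])       # slightly asymmetric initial weights
theta = 0.3                    # sliding threshold, tracks <y^2>
eta, tau = 0.002, 50.0

for _ in range(40000):
    x = patterns[rng.integers(2)]
    y = w @ x                              # postsynaptic rate
    w += eta * x * y * (y - theta)         # weaken below theta, strengthen above
    theta += (y ** 2 - theta) / tau        # threshold slides with activity

# the neuron becomes selective: responds to one pattern, ignores the other
```

The initially stronger input wins; the sliding threshold is what keeps the winner's growth bounded and drives the loser to zero.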

They derive the model from an optimization criterion: neurons should try to maximize the skewness of the distribution of their membrane potential -- much time spent either firing spikes or strongly inhibited.  This maps to an objective function F that looks like a valley -- hence the 'convallis' in the name (Latin for valley); the objective is differentiated to yield a weighting function for weight changes; they also add a shrinkage function (line + Heaviside function) to gate weight changes 'off' at resting membrane potential.

A network of firing neurons successfully groups correlated rate-encoded inputs, better than the STDP rule.  It can also cluster auditory inputs of spoken digits converted into cochleograms.  But this all seems relatively toy-like: of course algorithms can associate inputs that co-occur.  The same result was found for a recurrent balanced E-I network with the same cochleogram, and Convallis performed better than STDP.   Meh.

Perhaps the biggest thing I got from the paper was how poorly STDP fares with spike triplets:

Pre following post does not 'necessarily' cause LTD; it's more complicated than that, and more consistent with two coincidence detectors with different time constants.  This is satisfying as it allows for apical dendritic depolarization to serve as a contextual binding signal -- without negatively impacting the associated synaptic weights.

ref: -2017 tags: deep neuroevolution jeff clune Uber genetic algorithms date: 02-18-2021 18:27 gmt revision:1 [0] [head]

Deep Neuroevolution: genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. Uber AI labs; Jeff Clune.

  • In this paper, they used a (fairly generic) genetic algorithm to tune the weights of a relatively large (4M parameters) convolutional neural net to play 13 atari games. 
  • The GA used truncation selection, population of ~ 1k individuals, no crossover, and gaussian mutation.
  • To speed up and streamline this algo, they encoded the weights not directly but as an initialization seed to the RNG (log2 of the number of parameters, approximately), plus seeds to generate the per-generation mutation (~ 28 bits).  This substantially decreased the required storage space and communication costs when running the GA in parallel on their cluster; they only had to transmit the rng seed sequence. 
  • Quite surprisingly, the GA was good at typically 'hard' games like frostbite and skiing, whereas it fared poorly on games like atlantis (which is a fixed-gun shooter game) and assault. 
  • Performance was compared to Deep-Q-networks (DQN), Evolutionary search (which used stochastic gradient approximates), Asynchronous Advantage Actor-critic (A3C), and random search (RS)
  • They surmise that some games were thought to be hard, but are actually fairly easy, albeit with many local minima. This is why search around the origin (near the initialization of the networks, which was via the Xavier method) is sufficient to solve the tasks.
  • Also noted that frequently the GA would find individuals with good performance in ~10 generations, further supporting the point above. 
  • The GA provide very consistent performance across the entirety of a trial, which, they suggest, may offer a cleaner signal to selection as to the quality of each of the individuals (debatable!).
  • Of course, for some tasks, the GA fails woefully; it was not able to quickly learn to control a humanoid robot, which involves mapping a ~370-dimensional vector into ~17 joint torques.  Evolutionary search was able to perform this task, which is not surprising as the gradient here should be smooth.
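The seed-chain encoding is neat and simple to reconstruct -- a sketch of my own, in numpy: an individual is just a list of RNG seeds, and its weight vector is rebuilt deterministically from them.

```python
import numpy as np

N_PARAMS = 1000   # tiny stand-in for the paper's ~4M weights
SIGMA = 0.02      # gaussian mutation strength

def decode(seeds):
    """Rebuild a weight vector from its genome: an init seed plus one
    seed per generation of gaussian mutation."""
    w = np.random.default_rng(seeds[0]).standard_normal(N_PARAMS)
    for s in seeds[1:]:
        w += SIGMA * np.random.default_rng(s).standard_normal(N_PARAMS)
    return w

parent = [42, 7, 19]           # the whole genome is just a few integers
child = parent + [99]          # mutation = appending one more seed

w_parent, w_child = decode(parent), decode(child)
```

Decoding is deterministic, so workers only exchange seed lists rather than full weight vectors -- the paper's storage/communication trick.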

The result is indeed surprising, but it also feels lazy -- the total effort or information that they put into writing the actual algorithm is small; as mentioned in the introduction, this is a case of old algorithms with modern levels of compute.  Analogously, compare Go-Explore, also by Uber AI labs, vs Agent57 by DeepMind; the Agent57 paper blithely dismisses the otherwise breathless Go-Explore result as feature engineering and unrealistic free backtracking / game-resetting (which is true...) It's strange that they did not incorporate crossover aka recombination, as David MacKay clearly shows that recombination allows for much higher mutation rates and much better transmission of information through a population.  (Chapter 'Why have sex').  They also perhaps more reasonably omit developmental encoding, where network weights are tied or controlled through development, again in an analogy to biology.

A better solution, as they point out, would be some sort of hybrid GA / ES / A3C system which used both gradient-based tuning, random stochastic gradient-based exploration, and straight genetic optimization, possibly all in parallel, with global selection as the umbrella.  They mention this, but to my current knowledge this has not been done. 

ref: -2015 tags: olshausen redwood autoencoder VAE MNIST faces variation date: 11-27-2020 03:04 gmt revision:0 [head]

Discovering hidden factors of variation in deep networks

  • Well, they are not really that deep ...
  • Use a VAE to encode both a supervised signal (class labels) as well as unsupervised latents.
  • Penalize a combination of the MSE of reconstruction, logits of the classification error, and a special cross-covariance term to decorrelate the supervised and unsupervised latent vectors.
  • Cross-covariance penalty:
  • Tested on
    • MNIST -- discovered style / rotation of the characters
    • Toronto faces database -- seven expressions, many individuals; extracted eigen-emotions sorta.
    • Multi-PIE -- many faces, many viewpoints; was able to vary camera pose and illumination with the unsupervised latents.
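My note above omits the penalty's form; as I recall it (treat this as an assumption), it penalizes the squared entries of the batch cross-covariance matrix between the supervised and unsupervised latents -- a numpy sketch:

```python
import numpy as np

def xcov_penalty(y, z):
    """Sum of squared entries of the batch cross-covariance between y and z.
    (My reconstruction of the penalty's form -- treat as an assumption.)"""
    yc = y - y.mean(axis=0)
    zc = z - z.mean(axis=0)
    C = yc.T @ zc / len(y)        # cross-covariance matrix
    return 0.5 * np.sum(C ** 2)

rng = np.random.default_rng(8)
y = rng.standard_normal((1000, 3))                   # 'supervised' latents
z_ind = rng.standard_normal((1000, 4))               # independent latents
z_dep = np.column_stack(
    [y @ rng.standard_normal(3) for _ in range(4)])  # correlated latents
```

Latents correlated with y incur a much larger penalty than independent ones, which is the decorrelation pressure described above.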

ref: -0 tags: inductive logic programming deepmind formal propositions prolog date: 11-21-2020 04:07 gmt revision:0 [head]

Learning Explanatory Rules from Noisy Data

  • From a dense background of inductive logic programming (ILP): given a set of statements, and rules for transformation and substitution, generate clauses that satisfy a set of 'background knowledge'.
  • Programs like Metagol can do this using search and simplify logic built into Prolog.
    • Actually kinda surprising how very dense this program is -- only 330 lines!
  • This task can be transformed into a SAT problem via rules of logic, for which there are many fast solvers.
  • The trick here (instead) is that a neural network is used to turn 'on' or 'off' clauses that fit the background knowledge.
    • BK is typically very small, a few examples, consistent with the small size of the learned networks.
  • These weight matrices are represented as the outer product of composed or combined clauses, which makes the weight matrix very large!
  • They then do gradient descent, while passing the cross-entropy errors through nonlinearities (including clauses themselves? I think this is how recursion is handled.) to update the weights.
    • Hence, SGD is used as a means of heuristic search.
  • Compare this to Metagol, which is brittle to any noise in the input; unsurprisingly, due to SGD, this is much more robust.
  • Way too many words and symbols in this paper for what it seems to be doing. Just seems to be obfuscating the work (which is perfectly good). Again: Metagol is only 330 lines!
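As a toy analogy for "SGD as heuristic search over clause subsets" (my construction, NOT the paper's differentiable-ILP architecture): give each candidate clause a soft on/off gate, predict labels with a product-style soft OR over the gated clauses, and run gradient descent on the gate logits against noisy examples.

```python
import numpy as np

# Toy analogy only: soft gates over candidate clauses + a soft OR,
# trained by gradient descent, recover the true clause subset despite
# label noise -- SGD acting as heuristic search over clauses.
rng = np.random.default_rng(0)
n_examples, n_clauses = 200, 8
clause_truth = rng.integers(0, 2, size=(n_examples, n_clauses)).astype(float)
true_on = np.zeros(n_clauses)
true_on[[0, 3]] = 1.0                               # ground truth: clauses 0 and 3
labels = (clause_truth @ true_on > 0).astype(float) # positive if an 'on' clause fires
flip = rng.random(n_examples) < 0.05                # 5% label noise
labels = np.where(flip, 1.0 - labels, labels)

w = np.zeros(n_clauses)                             # gate logits
for step in range(3000):
    gates = 1.0 / (1.0 + np.exp(-w))
    q = 1.0 - gates * clause_truth                  # per-clause 'did not fire' factor
    p = 1.0 - np.prod(q, axis=1)                    # soft OR over gated clauses
    err = p - labels                                # d(0.5*(p-y)^2)/dp
    dp_dg = clause_truth * np.prod(q, axis=1)[:, None] / np.clip(q, 1e-6, None)
    w -= np.mean(err[:, None] * dp_dg, axis=0) * gates * (1.0 - gates)
```

Unlike a brittle symbolic search, the flipped labels just shift the converged gate values slightly rather than making the problem unsatisfiable.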

ref: -2011 tags: two photon cross section fluorescent protein photobleaching Drobizhev gcamp date: 11-04-2020 18:07 gmt revision:9 [8] [7] [6] [5] [4] [3] [head]

PMID-21527931 Two-photon absorption properties of fluorescent proteins

  • Significant 2-photon cross section of red fluorescent proteins (same chromophore as DsRed) in the 700 - 770nm range, accessible to Ti:sapphire lasers ...
    • This corresponds to a S_0 → S_n transition
    • But, photobleaching is an order of magnitude slower when excited via the direct S_0 → S_1 transition (though the fluorophores can be significantly less bright in this regime).
      • Quote: the photobleaching of DsRed slows down by an order of magnitude when the excitation wavelength is shifted to the red, from 750 to 950 nm (32).
    • See also PMID-18027924
  • Further work by same authors: Absolute Two-Photon Absorption Spectra and Two-Photon Brightness of Orange and Red Fluorescent Proteins
    • " TagRFP possesses the highest two-photon cross section, σ2 = 315 GM, and brightness, σ2φ = 130 GM, where φ is the fluorescence quantum yield. At longer wavelengths, 1000–1100 nm, tdTomato has the largest values, σ2 = 216 GM and σ2φ = 120 GM, per protein chain. Compared to the benchmark EGFP, these proteins present 3–4 times improvement in two-photon brightness."
    • "Single-photon properties of the FPs are poor predictors of which fluorescent proteins will be optimal in two-photon applications. It follows that additional mutagenesis efforts to improve two-photon cross section will benefit the field."
  • 2P cross-section in both the 700-800 nm and 1000-1100 nm ranges corresponds to the chromophore polarizability, and is not related to the 1-photon cross section.
  • This can be useful for multicolor imaging: excitation of the higher S0 → Sn transition of TagRFP simultaneously with the first, S0 → S1, transition of mKalama1 makes dual-color two-photon imaging possible with a single excitation laser wavelength (13)
  • Why are red GECIs based on mApple (rGECO1) or mRuby (RCaMP)? dsRed2 or TagRFP are much better .. but maybe they don't have CP variants.
  • from https://elifesciences.org/articles/12727

ref: -0 tags: double descent complexity construction gradient descent date: 10-26-2020 03:23 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

Why deep learning works even though it shouldn't, instigated a fun thread thinking about "complexity of model" vs "complexity of solution".

  • The blog post starts from the position that modern deep learning should not work because the models are much too complex for the datasets they are trained on -- they should not generalize well.
    • Quote: "why do models get better when they are bigger and deeper, even when the amount of data they consume stays the same or gets smaller?"
  • Argument: in high-dimensional spaces, all solutions are about the same distance from each other. This means that high dimensional spaces are very well connected. (Seems hand-wavy?)
    • Sub-argument: with billions of dimensions, it is exponentially unlikely that the curvature at a critical point will be positive in every direction (i.e. that you are in a local minimum). Much more likely that about half the directions curve up and half curve down -> a saddle.
    • This is of course looking at it in terms of gradient descent, which is not probably how biological systems build complexity. See also the saddle paper.
  • Claim: Early stopping is better regularization than any hand-picked a priori regularization, including implicit regularization like model size.
    • Well, maybe; stopping early of course is normally thought to prevent over-fitting or over-memorization of the dataset; but see also Double Descent, below.
    • Also: "that weight distributions are highly non-independent even after only a few hundred iterations" abstract of The Early phase of Neural Network Training
    • Or: "We study whether a neural network optimizes to the same, linearly connected minimum under different samples of SGD noise (e.g., random data order and augmentation). We find that standard vision models become stable to SGD noise in this way early in training. From then on, the outcome of optimization is determined to a linearly connected region. "
  • Claim: SGD, ADAM, etc. do not train to a minimum.
    • I think this is broadly supportable via the high-dimensional saddle argument.
    • He relates this to distillation: a large model can infer 'good structure', possibly via the good luck of having a very large parameter space; a small model can learn these features with fewer parameters, and hopefully there will be less 'nuisance' dimensions in the distilled data.
  • discussion on Hacker News
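The saddle sub-argument above is easy to check numerically (my toy check, not from the post): for random symmetric "Hessian-like" matrices, the fraction with all-positive eigenvalues -- i.e. the fraction of random critical points that are minima rather than saddles -- collapses with dimension.

```python
import numpy as np

# Fraction of random symmetric (GOE-style) matrices that are positive
# definite, i.e. whose critical point would be a local minimum.
rng = np.random.default_rng(0)

def frac_positive_definite(d, trials=2000):
    hits = 0
    for _ in range(trials):
        a = rng.normal(size=(d, d))
        h = (a + a.T) / np.sqrt(2.0 * d)        # symmetrized 'Hessian'
        hits += bool(np.all(np.linalg.eigvalsh(h) > 0))
    return hits / trials

for d in (1, 2, 4, 8):
    print(d, frac_positive_definite(d))
```

Already by dimension 8 essentially every sampled matrix has at least one negative eigenvalue; billions of dimensions only sharpen the effect.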

The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

  • This paper, well written & insightful.
  • Core idea: train up a network, not necessarily to completion or zero test error.
  • Prune away the smallest ~90% of the weights. Pruning is not at all a new idea.
    • For larger networks, they propose iterative pruning: train for a while, prune away connections that don't matter, continue.
      • Does this sound like human neural development? Yes!
  • Re-start training from the initial weights, with most of the network pruned away. This network will train up faster, to equivalent accuracy, compared to the original full network.
  • This seems to work well for MNIST and CIFAR10.
  • From this, they hypothesize that within a large network there is a 'lottery ticket' sub-network that can be trained well to represent the training / test dataset well.
    • "The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective"
  • However, either pruning the network (setting the weights to zero) before training, or re-initializing the weights in the trained network from the initialization distribution does not work.
  • "Dense, randomly-initialized networks are easier to train than the sparse networks that result from pruning because there are more possible subnetworks from which training might recover a winning ticket"
    • The blessing of dimensionality!
  • Complementary with dropout, at least in the iterative pruning regime.
  • But only with a slow learning rate (?) or learning-rate warmup for deeper nets.
  • Very complete appendix, as necessitated by the submission to ICLR. Within it there is a little of The truth wears off effect (or: more caves of complexity)
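The core recipe can be sketched in a few lines of numpy (the sizes and the "training" perturbation are stand-ins for an actual training run): prune the smallest-magnitude weights globally, then rewind the survivors to their initialization values.

```python
import numpy as np

# One round of magnitude pruning with weight rewinding, lottery-ticket style.
rng = np.random.default_rng(0)
w_init = 0.1 * rng.normal(size=(100, 100))               # saved initialization
w_trained = w_init + 0.05 * rng.normal(size=(100, 100))  # stand-in for training

prune_frac = 0.9
threshold = np.quantile(np.abs(w_trained), prune_frac)
mask = np.abs(w_trained) >= threshold      # keep the largest-magnitude ~10%

w_ticket = w_init * mask   # rewind surviving weights to their init values
```

Iterative pruning just repeats this loop with a smaller prune fraction per round; the key finding is that `w_ticket` (mask plus original init) trains well, while the same mask with re-drawn random weights does not.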

Stabilizing the lottery ticket hypothesis

  • With deeper neural networks, you can't prune away most of the weights before at least some training has occurred.
  • Instead, train the network partly, then do iterative magnitude pruning to select important weights.
  • Even with early training, this works well up to 80% sparsity on Imagenet.
  • Given the previous results, this doesn't seem so surprising..

OpenAI Deep Double Descent

  • Original phenomena discovered in Reconciling modern machine learning practice and the bias-variance trade-off
  • Why is bigger always better?
  • Another well-written and easily understood post.
  • At the interpolation threshold, there are relatively few models that fit the training data well, and label noise can easily mess up their global structure; beyond this interpolation threshold, there are many good models, and SGD somehow has implicit bias (??) to select models that are parsimonious and hence generalize well.
    • This despite the fact that classical statistics would suggest that the models are very over-parameterized.
    • Maybe it's the noise (the S in SGD) which acts as a regularizer? That plus the fact that the networks imperfectly represent structure in the data?
      • When there is near-zero training error, what does SGD do ??
  • Understanding deep double descent
    • Quote: but it still leaves what is in my opinion the most important question unanswered, which is: what exactly are the magical inductive biases of modern ML that make interpolation work so well?
  • Alternate hypothesis, from lesser wrong: ensembling improves generalization. "Which is something we've known for a long time".
    • the peak of a flat minimum is a slightly better approximation for the posterior predictive distribution over the entire hypothesis class. Sometimes I even wonder if something like this explains why Occam’s Razor works...
      • That's exactly correct. You can prove it via the Laplace approximation: the "width" of the peak in each principal direction is the inverse of an eigenvalue of the Hessian, and each eigenvalue \lambda_i contributes -\frac{1}{2} log(\lambda_i) to the marginal log likelihood log P[data|model]. So, if a peak is twice as wide in one direction, its marginal log likelihood is higher by \frac{1}{2} log(2), or half a bit. For models in which the number of free parameters is large relative to the number of data points (i.e. the interesting part of the double-descent curve), this is the main term of interest in the marginal log likelihood.
      • Ensembling does not explain the lottery ticket hypothesis.

  • Critical learning periods in deep neural networks
    • Per above, it also does not explain this result -- that the trace of the Fisher Information Matrix goes up then down with training; SGD consolidates the weights so that 'fewer matter'.
    • FIM, reminding myself: the expected value [ of the outer product of the derivative [ of the log-likelihood function, f(data; parameters) ] ], which is all a function of the parameters.
      • Expected value is taken over the data.
      • Derivative is with respect to the parameters. partial derivative = score; high score = data has a high local dependence on parameters, or equivalently, the parameters should be easier to estimate.
      • log-likelihood because that's the way it is; or: probabilities are best understood in decibels.
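Written out, the score / outer-product form of the same definition:

```latex
F(\theta) \;=\; \mathbb{E}_{x \sim p(x;\theta)}
\left[ \nabla_\theta \log p(x;\theta) \; \nabla_\theta \log p(x;\theta)^{T} \right]
```

The expectation is over the data x, the gradients are with respect to the parameters θ, and the trace of F is the quantity tracked in the critical-periods result above.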

  • Understanding deep-learning requires re-thinking generalization
  • Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
  • state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data
    • 95.2% accuracy is still very surprising for a million random labels from 1000 categories.
    • Training time increases by a small scalar factor with random labels.
    • Regularization via weight decay, dropout, and data augmentation do not eliminate fitting of random labels.
  • Works even when the true images are replaced by random noise too.
  • Depth two neural networks have perfect sample expressivity as soon as parameters > data points.
  • The second bit of meat in the paper is section 5, Implicit regularization: an appeal to linear models.
  • Have n data points \{x_i, y_i\} where the x_i are d-dimensional feature vectors, and the y_i are the labels.
    • if we want to solve the fitting problem min_{w \in R^d} \Sigma_{i=1}^{n} loss(w^T x_i, y_i) -- this is just linear regression, and if d > n, can fit exactly.
    • The Hessian of this function is degenerate -- the curvature is meaningless, and does not inform generalization.
  • With SGD, w_{t+1} = w_t - \eta e_t x_{i_t} where e_t is the prediction error.
  • If we start at w = 0, then w = \Sigma_{i=1}^{n} \alpha_i x_i for some coefficients \alpha.
  • Hence, w = X^T \alpha -- the weights are in the span of the data points.
  • If we interpolate perfectly, then X w = y
  • Substitute, and get X X^T \alpha = y
    • This is the "kernel trick" (Scholkopf et al 2001)
    • Depends only on the dot-products between all the datapoints -- it's an n×n linear system that can be solved exactly for small sets. (not pseudo-inverse!)
    • On MNIST, this results in a 1.2% test error (!)
    • With Gabor wavelet pre-processing, the error is 0.6% !
  • Out of all models, SGD will converge to the model with the minimum norm (without weight decay)
    • Norm is only a small part of the generalization puzzle.
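The span-of-the-data argument above can be verified directly (toy sizes, random data): solve the n×n Gram system, and confirm the resulting weights both interpolate exactly and coincide with the minimum-norm least-squares solution.

```python
import numpy as np

# With d > n features, solving the n-by-n Gram system X X^T alpha = y
# gives w = X^T alpha: weights in the span of the data that interpolate
# perfectly and have minimum norm.
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

alpha = np.linalg.solve(X @ X.T, y)   # n x n system, no pseudo-inverse needed
w = X.T @ alpha                       # weights lie in the span of the data
```

This is the 'kernel trick' solution written out for the linear kernel.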

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

Rethinking Parameter Counting in Deep Models: Effective Dimensionality Revisited

Random deep neural networks are biased towards simple functions

Reconciling modern machine learning practice and the bias-variance trade-off

ref: -2020 tags: replay hippocampus variational autoencoder date: 10-11-2020 04:09 gmt revision:1 [0] [head]

Brain-inspired replay for continual learning with artificial neural networks

  • Gido M. van de Ven, Hava Siegelmann, Andreas Tolias
  • In the real world, samples are not replayed in shuffled order -- they occur in a sequence, typically few times. Hence, for training an ANN (or NN?), you need to 'replay' samples.
    • Perhaps, to get at hidden structure not obvious on first pass through the sequence.
    • In the brain, reactivation / replay likely to stabilize memories.
      • Strong evidence that this occurs through sharp-wave ripples (or the underlying activity associated with this).
  • Replay is also used to combat a common problem in training ANNs - catastrophic forgetting.
    • Generally you just re-sample from your database (easy), though in real-time applications, this is not possible.
      • It might also take a lot of memory (though that is cheap these days) or violate privacy (though again who cares about that)

  • They study two different classification problems:
    • Task incremental learning (Task-IL)
      • Agent has to serially learn distinct tasks
      • OK for Atari, doesn't make sense for classification
    • Class incremental learning (Class-IL)
      • Agent has to learn one task incrementally, one/few classes at a time.
        • Like learning 2 digits at a time in MNIST
        • But is tested on all digits shown so far.
  • Solved via Generative Replay (GR, ~2017)
  • Use a recursive formulation: 'old' generative model is used to generate samples, which are then classified and fed, interleaved with the new samples, to the new network being trained.
    • 'Old' samples can be infrequent -- it's easier to reinforce an existing memory rather than create a new one.
    • Generative model is a VAE.
  • Compared with some existing solutions to catastrophic forgetting:
    • Methods to protect parameters in the network important for previous tasks
      • Elastic weight consolidation (EWC)
      • Synaptic intelligence (SI)
        • Both methods maintain estimates of how influential parameters were for previous tasks, and penalize changes accordingly.
        • "metaplasticity"
        • Synaptic intelligence: measure the loss change relative to the individual weights.
        • \delta L = \int \frac{\delta L}{\delta \theta} \frac{\delta \theta}{\delta t} \delta t ; converted into discrete time / SGD: L = \Sigma_k \omega_k = \Sigma_k \int \frac{\delta L}{\delta \theta_k} \frac{\delta \theta_k}{\delta t} \delta t
        • \omega_k are then the weightings for how much each parameter's change contributed to the training improvement.
        • Use this as a per-parameter regularization strength, scaled by one over the square of 'how far it moved'.
        • This is added to the loss, so that the network is penalized for moving important weights.
    • Context-dependent gating (XdG)
      • To reduce interference between tasks, a random subset of neurons is gated off (inhibition), depending on the task.
    • Learning without forgetting (LwF)
      • Method replays current task input after labeling them (incorrectly?) using the model trained on the previous tasks.
  • Generative replay works on Class-IL!
  • And is robust -- not to many samples or hidden units needed (for MNIST)
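The synaptic-intelligence bookkeeping described above (ω_k as a path integral of gradient × parameter change, then used as a per-parameter regularization strength) can be sketched on a stand-in quadratic loss:

```python
import numpy as np

# Synaptic-intelligence sketch: omega_k accumulates -(dL/dtheta_k) times
# the change in theta_k along the training trajectory; importance =
# omega / movement^2 then sets per-parameter regularization strength.
rng = np.random.default_rng(0)
theta = rng.normal(size=8)
theta_start = theta.copy()
omega = np.zeros_like(theta)
lr = 0.1

for step in range(100):
    grad = theta                   # dL/dtheta for the stand-in L = 0.5*||theta||^2
    delta = -lr * grad             # SGD update
    omega += -grad * delta         # loss *decrease* attributed to each parameter
    theta += delta

# per-parameter regularization strength (damping term avoids divide-by-zero)
importance = omega / ((theta - theta_start) ** 2 + 1e-3)
# a later task would add  c * sum(importance * (theta - theta_old)**2)  to its loss
```

On this quadratic, each ω_k converges to roughly the loss reduction attributable to parameter k, which is the intended interpretation.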

  • Yet the generative replay system does not scale to CIFAR or permuted MNIST.
  • E.g. if you take the MNIST pixels, permute them based on a 'task', and ask a network to still learn the character identities, it can't do it ... though synaptic intelligence can.
  • Their solution is to make 'brain-inspired' modifications to the network:
    • RtF, Replay-though-feedback: the classifier and generator network are fused. Latent vector is the hippocampus. Cortex is the VAE / classifier.
    • Con, Conditional replay: normal prior for the VAE is replaced with multivariate class-conditional Gaussian.
      • Not sure how they sample from this, check the methods.
    • Gat, Gating based on internal context.
      • Gating is only applied to the feedback layers, since for classification ... you don't a priori know the class!
    • Int, Internal replay. This is maybe the most interesting: rather than generating pixels, feedback generates hidden layer activations.
      • First layer of a network is convolutional, dependent on visual feature statistics, and should not change much.
        • Indeed, for CIFAR, they use pre-trained layers.
      • Internal replay proved to be very important!
    • Dist, Soft target labeling of the generated targets; cross-entropy loss when training the classifier on generated samples. Aka distillation.
  • Results suggest that regularization / metaplasticity (keeping memories in parameter space) and replay (keeping memories in function space) are complementary strategies,
    • And that the brain uses both to create and protect memories.
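My reading of the 'Con' (conditional replay) modification, as a sketch: replace the standard-normal prior with per-class Gaussians, so replay for a chosen class samples that class's latent distribution and decodes it. How the class statistics are fit is an assumption here; check their methods.

```python
import numpy as np

# Class-conditional latent prior sketch: sample replay latents from a
# per-class Gaussian rather than N(0, I).  class_mu / class_sigma would
# be fit from data in practice; names here are hypothetical.
rng = np.random.default_rng(0)
n_classes, latent_dim = 10, 16
class_mu = rng.normal(scale=2.0, size=(n_classes, latent_dim))
class_sigma = np.ones((n_classes, latent_dim))

def sample_latents_for_class(c, n):
    """Draw n latent vectors from class c's Gaussian."""
    return class_mu[c] + class_sigma[c] * rng.normal(size=(n, latent_dim))

z = sample_latents_for_class(3, 128)   # feed z to the decoder for replay samples
```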

  • When I first read this paper, it came across as a great story -- well thought out, well explained, a good level of detail, and sufficiently supported by data / lesioning experiments.
  • However, looking at the first author's publication record, it seems that he's been at this for >2-3 years ... things take time to do & publish.
  • Folding in of the VAE is satisfying -- taking one function approximator and using it to provide memory for another function approximator.
  • Also satisfying are the neurological inspirations -- and that full feedback to the pixel level was not required!
    • Maybe the hippocampus does work like this, providing high-level feature vectors to the cortex.
    • And it's likely that the cortex has some features of a VAE, e.g. able to perceive and imagine through the same nodes, just run in different directions.
      • The fact that both concepts led to an engineering solution is icing on the cake!

ref: -2017 tags: schema networks reinforcement learning atari breakout vicarious date: 09-29-2020 02:32 gmt revision:2 [1] [0] [head]

Schema networks: zero-shot transfer with a generative causal model of intuitive physics

  • Like a lot of papers, the title has more flash than the actual results.
  • Presents results that would be state of the art (as of 2017) in playing Atari Breakout, then transferring performance to modifications of the game (paddle moved up a bit, wall added in the middle of the bricks, brick respawning, juggling).
  • Schema network is based on 'entities' (objects) which have binary 'attributes'. These attributes can include continuous-valued signals, in which case each binary variable is like a place field (I think).
    • This is clever and interesting -- rather than just low-level features pointing to high-level features, this means that high-level entities can have records of low-level features -- an arrow pointing in the opposite direction, one which can (also) be learned.
    • The same idea is present in other Vicarious work, including the CAPTCHA paper and more-recent (and less good) Bio-RNN paper.
  • Entities and attributes are propagated forward in time based on 'ungrounded schemas' -- basically free-floating transition matrices. The grounded schemas are entities and action groups that have evidence in observation.
    • There doesn't seem to be much math describing exactly how this works; only exposition. Or maybe it's all hand-waving over the actual, much simpler math.
      • Get the impression that the authors are reaching for a level of formalism when in fact they just made something that works for the breakout task... I infer Dileep prefers the empirical to the formal, so this is likely primarily the first author.
  • There are no perceptual modules here -- game state is fed to the network directly as entities and attributes (and, to be fair, to the A3C model).
  • Entity-attribute vectors are concatenated into a column vector of length NT, where N is the number of entities and T is the number of time slices.
    • For each of the N entities over time T, a row vector is made of length MR, where M is the number of attributes (fixed per task) and R-1 is the number of neighbors in a fixed radius. That is, each entity is related to its neighbors' attributes over time.
    • This is a (large, sparse) binary matrix, X.
  • y is the vector of actions; the task is to predict actions from X.
    • How is X learned?? Very unclear in the paper vs. figure 2.
  • The solution is approximated as y = X W \bar{1} where W is a binary weight matrix.
    • Minimize the solution based on an objective function on the error and the complexity of W.
    • This is found via linear programming relaxation. "This procedure monotonically decreases the prediction error of the overall schema network, while increasing its complexity".
      • As it's an issue of binary conjunctions, this seems like a SAT problem!
    • Note that it's not probabilistic: "For this algorithm to work, no contradictions can exist in the input data" -- they instead remove them!
  • Actual behavior includes maximum-product belief propagation, to look for series of transitions that set the reward variable without setting the fail variable.
    • Because the network is loopy, this has to occur several times to set the entity variables, and includes backtracking.

  • Have there been any further papers exploring schema networks? What happened to this?
  • The later paper from Vicarious on zero-shot task transfer are rather less interesting (to me) than this.

ref: -2005 tags: dimensionality reduction contrastive gradient descent date: 09-13-2020 02:49 gmt revision:2 [1] [0] [head]

Dimensionality reduction by learning and invariant mapping

  • Raia Hadsell, Sumit Chopra, Yann LeCun
  • Central idea: learn an invariant mapping of the input by minimizing mapped distance (e.g. the distance between outputs) when the samples are categorized as the same (same numbers in MNIST, e.g.), and maximizing mapped distance when the samples are categorized as different.
    • Two loss functions for same vs different.
  • This is an attraction-repulsion spring analogy.
  • Use gradient descent to change the weights to satisfy these two competing losses.
  • The resulting convolutional neural nets can extract camera pose information from the NORB dataset.
  • Surprising how simple analogies like this, when iterated across a great many samples, pull out intuitively correct invariances.
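The two competing losses can be written compactly (this is the standard contrastive form from Hadsell, Chopra & LeCun; the exact constants in the paper may differ):

```python
import numpy as np

# Contrastive / spring loss: pull 'same' pairs together, push 'different'
# pairs apart, but only up to a margin m.
def contrastive_loss(f1, f2, same, m=1.0):
    """f1, f2: (N, D) mapped outputs; same: (N,) 1.0 if the pair matches."""
    d = np.linalg.norm(f1 - f2, axis=1)
    attract = 0.5 * d ** 2                      # spring pulling same pairs in
    repel = 0.5 * np.maximum(0.0, m - d) ** 2   # spring pushing others out, to margin m
    return np.mean(same * attract + (1.0 - same) * repel)
```

Same pairs are penalized by squared distance; different pairs incur loss only inside the margin, which is what lets the mapping collapse each class without collapsing everything.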

ref: -2004 tags: neural synchrony binding robot date: 09-13-2020 02:00 gmt revision:0 [head]

PMID-15142952 Visual binding through reentrant connectivity and dynamic synchronization in a brain-based device

  • Controlled a robot with a complete (for the time) model of the occipital-inferotemporal visual pathway (V1 V2 V4 IT), auditory cortex, colliculus, 'value cortex'.
  • Synapses had a timing-dependent associative BCM learning rule
  • Robot had reflexes to orient toward preferred auditory stimuli
  • Subsequently, the robot 'learned' to orient toward a preferred stimulus (e.g. one that caused orientation).
  • Visual stimuli were either diamonds or squares, either red or green.
    • Discrimination task could have been carried out by (it seems) one perceptron layer.
  • This was 16 years ago, and the results look quaint compared to the modern deep-learning revolution. That said, 'the binding problem' is imho still outstanding or at least interesting. Actual human perception is far more compositional than a deep CNN can support.

ref: -2020 tags: Neuralink commentary BMI pigs date: 08-31-2020 18:01 gmt revision:1 [0] [head]

Neuralink progress update August 28 2020

Some commentary.

The good:

  • Ian hit the nail on the head @ 1:05:47. That is not a side-benefit -- that was the original and true purpose. Thank you.
  • The electronics, amplify / record / sort / stim ASIC, as well as interconnect all advance the state of the art in density, power efficiency, and capability. (I always liked higher sampling rates, but w/e)
  • Puck is an ideal form factor, again SOTA. 25mm diameter craniotomy should give plenty of space for 32 x 32-channel depth electrodes (say).
  • I would estimate that the high-density per electrode feed-through is also SOTA, but it might also be a non-hermetic pass-through via the thin-film (e.g. some water vapor diffusion along the length of the polyimide (if that polymer is being used)).
  • Robot looks nice dressed in those fancy robes. Also looks like there is a revolute joint along the coronal axis.
  • Stim on every channel is cool.
  • Pigs seem like an ethical substitute for monkeys.

The mixed:

  • Neurons are not wires.
  • $2000 outpatient neurosurgery?! Will need to address the ~3% complication rate for most neurosurgery.
  • Where is the monkey data? Does it not work in monkeys? Insufficient longevity or yield? Was it strategic to not mention any monkeys, to avoid bad PR or the wrath of PETA?
    • I can't imagine getting into humans without demonstrating both safety and effectiveness on monkeys. Pigs are fine for the safety part, but monkeys are the present standard for efficacy.
  • How long do the electrodes last in pigs? What is the recording quality? How stable are the traces?
    • Judging from the commentary, I assume this is an electrode material problem? What does Neuralink do if they are not significantly different in yield and longevity from the Utah array? (The other problems might well be easier than this one.)
      • That said, a thousand channels of EMG should be sufficient for some of the intended applications (below).
    • It really remains to be seen how well the brain tolerates these somewhat-large somewhat-thin electrodes, what percentage of the brain is disrupted in the process of insertion, and how much of the disruption is transient / how much is irrecoverable.
    • Pig-snout somatosensory cortex is an unusual recording location, making comparison difficult, but what was shown seemed rather correlated (?) We'd have to read an actual scientific publication to evaluate.
  • This slide is deceptive, as not all the applications are equally .. applicable. You don't need an extracellular ephys device to solve these problems that "almost everyone" will encounter over the course of their lives.
    • Memory loss -- Probably better dealt with via cellular / biological therapies, or treating the causes (stroke, infection, inflammation, neuroendocrine or neuromodulatory dysregulation)
    • Hearing loss -- Reasonable. Nice complement to improved cochlear implants too. (Maybe the Neuralink ASIC could be used for that, too).
      • With this and the other reasonable applications, best to keep in context that stereo EEG, which is fairly disruptive w/ large probes, is well tolerated in epilepsy patients. (It has unclear effect on IQ or memory, but still, the sewing machine should be less invasive.)
    • Blindness -- Reasonable. Mating the puck to a Second Sight style thin film would improve channel count dramatically, and be less invasive. Otherwise you have to sew into the calcarine fissure, destroying a fair bit of cortex in the process & possibly hitting an artery or sulcal vein.
    • Paralysis -- Absolutely. This application is well demonstrated, and the Neuralink device should be able to help SCI patients. Presumably this will occupy them for the next five years; other applications would be a distraction.
      • Being able to sew flexible electrodes into the spinal cord is a great application.
    • Depression -- Need deeper targets for this. Research to treat depression via basal ganglia stim is ongoing; no reason it could not be mated to the Neuralink puck + long electrodes.
    • Insomnia -- I guess?
    • Extreme pain -- Simpler approaches are likely better, but sure?
    • Seizures -- Yes, but note that Neuropace burned through $250M and wasn't significantly better than sham surgery. Again, likely better dealt with biologically: recombinant ion channels, glial or interneuron stem cell therapy.
    • Anxiety -- maybe? Designer drugs seem safer. Or drugs + CBT. Elon likes root causes: spotlight on the structural ills of our society.
    • Addiction -- Yes. It seems possible to rewire the brain with the right record / stim strategy, via for example a combination of DBS and cortical recording. Social restructuring is again a better root-cause fix.
    • Strokes -- No, despite best efforts, the robot causes (small) strokes.
    • Brain Damage -- Insertion of electrodes causes brain damage. Again, better dealt with via cellular (e.g. stem cells) or biological approaches.
      • This, of course, will take time as our understanding of brain development is limited; the good thing is that sufficient guidance signals remain in the adult brain, so AFAIK it's possible. From his comments, seems Alan's attitude is more aligned with this.
    • Not really bad per se, but the right panel could be better. I assume this was a design decision trade-off between working distance, NA, illumination, and mechanical constraints.
    • Despite Elon's claims, there is always bleeding when you poke electrodes that large into the cortex; the capillary bed is too dense. Let's assume Elon meant 'macro' bleeding, which is true. At least the robot avoids visible vessels.
    • Predicting joint angles for cyclical behavior is not challenging; can be done with EMG or microphonic noise correlated to some part of the gait. Hence the request for monkey BMI data.
  • Given the risk, pretty much any of the "sci-fi" applications mentioned in response to dorky twitter comments can be better provided to neurologically normal people through electronics, without the risk of brain surgery.
  • Regarding sci-fi application linguistic telepathy:
    • First, agreed, clarifying thoughts into language takes effort. This is a mostly unavoidable and largely good task. Interfacing with the external world is a vital part of cognition; shortcutting it, in my estimation, will just lead to sloppy & half-formed ideas not worth communicating. The compression of thoughts into words (as lossy as it may be) is the primary way to make them discrete enough to be meaningful to both other people and yourself.
    • Secondly: speech (or again any of the many other forms of communication) is not that much slower than cognition. If it were, we'd have much larger vocabularies and much more complicated, meaning-conveying grammar (like Latin?). The limit is the average person's cognition and memory. I disagree with Elon's conceit.
  • Regarding visual telepathy, with sufficient recording capabilities, I see no reason why you couldn't have a video-out port on the brain. Difficult given the currently mostly unknown representation of higher-level visual cortices, but as Ian says, once you have a good oscilloscope, this can be deduced.
  • Regarding AI symbiosis @1:09:19; this logic is not entirely clear to me. AI is a tool that will automate & facilitate the production and translation of knowledge much the same way electricity etc automated & facilitated the production and transportation of physical goods. We will necessarily need to interface with it, but until we are thoroughly modifying our own development & biology, those interfaces will likely be based on presently extant computer interfaces.
    • If we do start modifying the biological wiring structure of our brains, I can't imagine that there will be many limits! (Outside hard metabolic limits that brain vasculature takes pains to allocate and optimize.)
    • So, I guess the central tenet might be vaguely ok if you allow that humans are presently symbiotic with cell phones. (A more realistic interpretation is that cell phones are tools, and maybe Google etc are the symbionts / parasites). This is arguably contributing to current political existential crises -- no need to look further. If you do look further, it's not clear that stabbing the brains of healthy individuals will help.
    • I find the MC to be slightly unctuous and ingratiating in a way appropriate for a video game company, but not for a medical device company. That, of course, is a judgement call & matter of taste. Yet, as this was partly a recruiting event ... you will find who you set the table for.

ref: -0 tags: synaptic plasticity 2-photon imaging inhibition excitation spines dendrites synapses 2p date: 08-14-2020 01:35 gmt revision:3 [2] [1] [0] [head]

PMID-22542188 Clustered dynamics of inhibitory synapses and dendritic spines in the adult neocortex.

  • Cre-recombinase-dependent labeling of postsynaptic scaffolding via a Gephyrin-Teal fluorophore fusion.
  • Also added Cre-eYFP to label the neurons
  • Electroporated in utero at E16 in mice.
    • Low concentration of Cre, high concentrations of the Gephyrin-Teal and Cre-eYFP constructs to attain sparse labeling.
  • Located the same dendrite imaged in-vivo in fixed tissue - !! - using serial-section electron microscopy.
  • 2230 dendritic spines and 1211 inhibitory synapses from 83 dendritic segments in 14 cells of 6 animals.
  • Some spines had inhibitory synapses on them -- 0.7 / 10um, vs 4.4 / 10um dendrite for excitatory spines. ~ 1.7 inhibitory
  • Suggest that the data support the idea that inhibitory inputs may be gating excitation.
  • Furthermore, co-innervated spines are stable, both during normal experience and during monocular deprivation.
  • Monocular deprivation induces a pronounced loss of inhibitory synapses in binocular cortex.

ref: -2013 tags: 2p two photon STED super resolution microscope synapse synaptic plasticity date: 08-14-2020 01:34 gmt revision:3 [2] [1] [0] [head]

PMID-23442956 Two-Photon Excitation STED Microscopy in Two Colors in Acute Brain Slices

  • Plenty of details on how they set up the microscope.
  • Mice: Thy1-eYFP (some excitatory cells in the hippocampus and cortex) and CX3CR1-eGFP (GFP in microglia). Crossbred the two strains for two-color imaging.
  • Animals were 21-40 days old at slicing.

PMID-29932052 Chronic 2P-STED imaging reveals high turnover of spines in the hippocampus in vivo

  • As above, Thy1-GFP / Thy1-YFP labeling; hence this was a structural study (for which the high resolution of STED was necessary).
  • Might just as well have gone with synaptic labels, e.g. tdTomato-Synapsin.

ref: -0 tags: synaptic plasticity LTP LTD synapses NMDA glutamate uncaging date: 08-11-2020 22:40 gmt revision:0 [head]

PMID-31780899 Single Synapse LTP: A matter of context?

  • Not a great name for a thorough and reasonably well-written review of glutamate uncaging studies as related to LTP (and to a lesser extent LTD).
  • Lots of references from many familiar names. Nice to have them all in one place!
  • I'm left wondering: between CaMKII, PKA, PKC, Ras, and other GTP-dependent molecules -- how much of the regulatory network in the synapse is known? E.g. if you pull down all proteins in the synaptosome & their interacting partners, how many are unknown, or have an unknown function? I know something like this has been done for flies, but in mammals -- ?

ref: -0 tags: GEVI review voltage sensor date: 08-10-2020 22:22 gmt revision:24 [23] [22] [21] [20] [19] [18] [head]

Various GEVIs invented and evolved:

Ace-FRET sensors

  • PMID-26586188 Ace-mNeonGreen, an opsin-FRET sensor, might still be better in terms of SNR, but it's green.
    • Negative $\Delta F / F$ with depolarization.
    • Fast enough to resolve spikes.
    • Rational design; little or no screening.
    • Ace is about six times as fast as Mac, and mNeonGreen has a ~50% higher extinction coefficient than mCitrine and nearly threefold better photostability (12)

  • PMID-31685893 A High-speed, red fluorescent voltage sensor to detect neural activity
    • Fusion of Ace2N + short linker + mScarlet, a bright (if not the brightest; highest QY) monomeric red fluorescent protein.
    • Almost as good SNR as Ace2N-mNeonGreen.
    • Also a FRET sensor; negative delta F with depolarization.
    • Ace2N-mNeon is not sensitive under two-photon illumination; presumably this is true of all eFRET sensors?
    • Ace2N drives almost no photocurrent.
    • Sought to maximize SNR: dF/F_0 X sqrt(F_0); screened 'only' 18 linkers to see what worked the best. Yet - it's better than VARNAM.
    • ~ 14% dF/F per 100mV depolarization.

Arch and Mac rhodopsin sensors

  • PMID-22120467 Optical recording of action potentials in mammalian neurons using a microbial rhodopsin Arch 2011
    • Endogenous fluorescence of the retinal (+ environment) of microbial rhodopsin protein Archaerhodopsin 3 (Arch) from Halorubrum sodomense.
    • A version of the proton pump without pumping capability also showed voltage dependence, but with slower kinetics.
      • This required one mutation, D95N.
    • Requires fairly intense illumination, as the QY of the fluorophore is low (9 x 10-4). Still, photobleaching rate was relatively low.
    • Arch is mainly used for neuronal inhibition.

  • PMID-25222271 Archaerhodopsin Variants with Enhanced Voltage Sensitive Fluorescence in Mammalian and Caenorhabditis elegans Neurons Archer1 2014
    • Capable of voltage sensing under red light, and inhibition (via proton pumping) under green light.
    • Note: "The high laser power used to excite Arch (above) fluorescence causes significant autofluorescence in intact tissue and limits its accessibility for widespread use."
    • Archers have 3-5x the fluorescence of WT Arch -- so, QY of ~3.6e-3. Still very dim.
    • Archer1 dF/F_0 85%; Archer2 dF/F_0 60% @ 100mV depolarization (positive sense).
    • Screened the proton pump of Gloeobacter violaceus rhodopsin; found mutations were then transferred to Arch.
      • Maybe they were planning on using the Gloeobacter rhodopsin, but it didn't work for some reason, so they transferred the mutations to Arch.
    • TS and ER export domains for localization.

  • PMID-24755708 Imaging neural spiking in brain tissue using FRET-opsin protein voltage sensors MacQ-mOrange and MacQ-mCitrine.
    • L. maculans (Mac) rhodopsin (faster than Arch) + FP mCitrine, FRET sensor + ER/TS.
    • Four-fold faster kinetics and 2-4x brighter than ArcLight.
      • No directed evolution to optimize sensitivity or brightness. Just kept the linker short & trimmed residues based on crystal structure.
    • ~5% delta F/F, can resolve spikes up to 10Hz.
    • Spectroscopic studies of the proton pumping photocycle in bacteriorhodopsin and Archaerhodopsin (Arch) have revealed that proton translocation through the retinal Schiff base changes chromophore absorption [24-26]
    • Used rational design to abolish the proton current (D139N and D139Q aka MacQ) ; screens to adjust the voltage sensing kinetics.
    • Still has photocurrents.
    • Seems that slice / in vivo performance is consistently worse than cultured neurons... in Purkinje neurons, dF/F was 1.2%, even though the in vitro response was ~15% to a 100mV depolarization.
    • Imaging intensity 30mw/mm^2. (3W/cm^2)

  • PMID-24952910 All-optical electrophysiology in mammalian neurons using engineered microbial rhodopsins QuasAr1 and QuasAr2 2014
    • Directed evolution approach to improve the brightness and speed of Arch D95N.
      • Improved the fluorescence QY by 19x and 10x (QuasAr1 and QuasAr2, respectively -- QuasAr2 has higher sensitivity).
    • Also developed a low-intensity channelrhodopsin, CheRiff, which can be activated by blue light (lambda max = 460 nm) dim enough to not affect QuasAr.
    • They call the two of them 'Optopatch 2'.
    • Incident light intensity 1kW / cm^2 (!)

  • PMID-29483642 A robotic multidimensional directed evolution approach applied to fluorescent voltage reporters. Archon1 2018
    • Started with QuasAr2 (above), which was evolved from Arch. Intrinsic fluorescence of retinal in rhodopsin.
    • Expressed in HEK293T cells; then FACS, robotic cell picking, whole genome amplification, PCR, cloning.
    • Also evolved miRFP, deep red fluorescent protein based on bacteriophytochrome.
    • delta F/F of 80 and 20% with a 100mV depolarization.
    • We investigated the contribution of specific point mutations to changes in localization, brightness, voltage sensitivity and kinetics and found the patterns that emerged to be complex (Supplementary Table 6), with a given mutation often improving one parameter but worsening another.
    • If the original QY of Arch was 9e-4, and QuasAr2 improved this by ~10x, and Archon1 improved it by a further 2.3x, then the QY of Archon1 is ~0.02. Given the molar extinction coefficient of retinal is ~50,000, this means the brightness of the fluorescent probe is low, ~1 (good fluorescent proteins and synthetic dyes have a brightness of ~90).
  • Imaged using 637nm laser light at 800mW/mm2 for Archon1 and Archon2; emission filtered through 664LP
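The brightness arithmetic in the bullet above, written out as a quick sanity check (all numbers are the note's approximations, not measured values):

```python
# Back-of-envelope brightness estimate for Archon1, per the note above.
# Brightness is conventionally extinction coefficient x QY / 1000 (mM^-1 cm^-1).
qy_arch = 9e-4                     # WT Arch quantum yield
qy_archon1 = qy_arch * 10 * 2.3    # QuasAr2 ~10x Arch; Archon1 ~2.3x QuasAr2
extinction_retinal = 50_000        # M^-1 cm^-1, approximate
brightness = extinction_retinal * qy_archon1 / 1000
print(brightness)  # ~1, versus ~90 for good fluorescent proteins and dyes
```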

VSD - FP sensors

  • PMID-28811673 Improving a genetically encoded voltage indicator by modifying the cytoplasmic charge composition Bongwoori 2017
    • ArcLight derivative.
    • Arginine (positive charge) scanning mutagenesis of the linker region improved the signal size of the GEVI, Bongwoori, yielding fluorescent signals as high as 20% ΔF/F during the firing of action potentials.
    • Used the mutagenesis to shift the threshold for fluorescence change more negative, ~ -30mV.
    • Like ArcLight, it's slow.
    • Strong baseline shift due to the acidification of the neuron during AP firing (!)

  • Attenuation of synaptic potentials in dendritic spines
    • Found that SNR / dF / F_0 is limited by intracellular localization of the sensor.
      • This is true even though ArcLight is supposed to be in a dark state in the lower pH of intracellular organelles.. a problem worth considering.
      • Makes negative-going GEVIs more practical, as those not in the membrane are dark @ 0mV.

  • Fast two-photon volumetric imaging of an improved voltage indicator reveals electrical activity in deeply located neurons in the awake brain ASAP3 2018
    • Opsin-based GEVIs have been used in vivo with 1p excitation to report electrical activity of superficial neurons, but their responsivity is attenuated for 2p excitation. (!)
    • Site-directed evolution in HEK cells.
    • Expressed linear PCR products directly in the HEK cells, with no assembly / ligation required! (Saves lots of time: normally you need to amplify, assemble into a plasmid, transfect, culture, measure, purify the plasmid, digest, EP PCR, etc.)
    • Screened in a motorized 384-well conductive plate; an electroporation electrode sequentially stimulates each well on an upright microscope.
    • 46% improvement over ASAP2 R414Q
    • Ace2N-4aa-mNeon is not responsive under 2p illumination; nor is Archon1 or QuasAr2/3.
    • ULOVE = AOD based fast local scanning 2-p random access scope.

  • Bright and tunable far-red chemigenetic indicators
    • GgVSD (same as ASAP above) + cp HaloTag + Si-Rhodamine JF635
    • ~ 4% dF/F_0 during APs.
    • Found one mutation, R476G in the linker between cp Halotag and S4 of the VSD, which doubled the sensitivity of HASAP.
    • Also tested a ArcLight type structure, CiVSD fused to Halotag.
      • HArcLight had negative dF/F_0 and ~3% change in response to APs.
    • No voltage sensitivity when the synthetic dye was largely in the zwitterionic form, e.g. tetramethylrhodamine.

ref: -2015 tags: spiking neural networks causality inference demixing date: 07-22-2020 18:13 gmt revision:1 [0] [head]

PMID-26621426 Causal Inference and Explaining Away in a Spiking Network

  • Rubén Moreno-Bote & Jan Drugowitsch
  • Use linear non-negative mixing plus noise to generate a series of sensory stimuli.
  • Pass these through a one-layer spiking or non-spiking neural network with adaptive global inhibition and adaptive reset voltage to solve this quadratic programming problem with non-negative constraints.
  • N causes, one observation: $\mu = \sum_{i=1}^{N} u_i r_i + \epsilon$ ,
    • $r_i \geq 0$ -- causes can be present or not present, but not negative.
    • cause coefficients drawn from a truncated (positive only) Gaussian.
  • Linear spiking network with symmetric weight matrix $J = -U^T U - \beta I$ (see figure above)
    • That is ... J looks like a correlation matrix!
    • $U$ is M x N; columns are the mixing vectors.
    • U is known beforehand and not learned
      • That said, as a quasi-correlation matrix, it might not be so hard to learn. See ref [44].
  • Can solve this problem by minimizing the negative log-posterior function: $$ L(\mu, r) = \frac{1}{2}(\mu - Ur)^T(\mu - Ur) + \alpha1^Tr + \frac{\beta}{2}r^Tr $$
    • That is, want to maximize the joint probability of the data and observations given the probabilistic model $p(\mu, r) \propto \exp(-L(\mu, r)) \prod_{i=1}^{N} H(r_i)$
    • First term quadratically penalizes difference between prediction and measurement.
    • Second term: $\alpha$ is an L1 regularization; third term: $\beta$ is an L2 regularization.
  • The negative log-likelihood is then converted to an energy function (linear algebra): $W = U^T U$ , $h = U^T \mu$ ; then $E(r) = \frac{1}{2} r^T W r - r^T h + \alpha 1^T r + \frac{\beta}{2} r^T r$
    • This is where they get the weight matrix J (or W). If the mixing vectors (columns of U) are linearly independent, then it is negative semidefinite.
  • The dynamics of individual neurons w/ global inhibition and variable reset voltage serves to minimize this energy -- hence, solve the problem. (They gloss over this derivation in the main text).
  • Next, show that a spike-based network can similarly 'relax' or descend the objective gradient to arrive at the quadratic programming solution.
    • Network is N leaky integrate and fire neurons, with variable synaptic integration kernels.
    • $\alpha$ then translates to global inhibition, and $\beta$ to a lowered reset voltage.
  • Yes, it can solve the problem .. and do so in the presence of firing noise in a finite period of time .. but a little bit meh, because the problem is not that hard, and there is no learning in the network.
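As a rate-based sanity check of the objective above, here is a minimal sketch (hypothetical U and mu; projected gradient descent with rectification stands in for the spiking network's relaxation dynamics):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mixing matrix U (M observations x N causes) and observation mu;
# these stand in for the paper's sensory-stimulus generation.
M, N = 8, 4
U = np.abs(rng.normal(size=(M, N)))       # non-negative mixing vectors
r_true = np.array([1.0, 0.0, 2.0, 0.5])   # causes present or absent, never negative
mu = U @ r_true + 0.01 * rng.normal(size=M)

alpha, beta = 0.1, 0.1  # L1 and L2 regularizers
W = U.T @ U             # network weights would be J = -W - beta*I
h = U.T @ mu

def energy(r):
    # E(r) = 0.5 r^T W r - r^T h + alpha 1^T r + 0.5 beta r^T r
    return 0.5 * r @ W @ r - r @ h + alpha * r.sum() + 0.5 * beta * r @ r

# Projected gradient descent: rectification enforces the r >= 0 constraint.
r = np.zeros(N)
lr = 0.01
for _ in range(5000):
    grad = W @ r - h + alpha + beta * r
    r = np.maximum(r - lr * grad, 0.0)
```

The minimizer stays non-negative and reconstructs mu well, which is all the network dynamics are claimed to do.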

ref: -2017 tags: GraphSAGE graph neural network GNN date: 07-16-2020 15:49 gmt revision:2 [1] [0] [head]

Inductive representation learning on large graphs

  • William L. Hamilton, Rex Ying, Jure Leskovec
  • Problem: given a graph where each node has a set of (possibly varied) attributes, create an 'embedding' vector at each node that describes both the node and the network that surrounds it.
  • To this point (2017) there were two ways of doing this -- through matrix factorization methods, and through graph convolutional networks.
    • The matrix factorization methods or spectral methods (similar to multi-dimensional scaling, where points are projected onto a plane to preserve a distance metric) are transductive : they work entirely within-data, and don't directly generalize to new data.
      • This is parsimonious in some sense, but doesn't work well in the real world, where datasets are constantly changing and frequently growing.
  • Their approach is similar to graph convolutional networks, where (I think) the convolution is indexed by node distances.
  • General idea: each node starts out with an embedding vector = its attribute or feature vector.
  • Then, all neighboring nodes are aggregated by sampling a fixed number of the nearest neighbors (fixed for computational reasons).
    • Aggregation can be mean aggregation, LSTM aggregation (on random permutations of the neighbor nodes), or MLP -> nonlinearity -> max-pooling. Pooling has the most wins, though all seem to work...
  • The aggregated vector is concatenated with the current node feature vector, and this is fed through a learned weighting matrix and nonlinearity to output the feature vector for the current pass.
  • Passes proceed from the outside in... I think.
  • Algorithm is inspired by the Weisfeiler-Lehman Isomorphism Test, which updates neighbor counts per node to estimate if graphs are isomorphic. They do a similar thing here, only with vectors not scalars, and similarly take into account the local graph structure.
    • All the aggregator functions, and of course the nonlinearities and weighting matrices, are differentiable -- so the structure is trained in a supervised way with SGD.
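A minimal numpy sketch of one such layer with the mean aggregator (toy graph and random weights; in the actual algorithm W is learned with SGD):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy graph as an adjacency list, plus node features h (N nodes x F dims).
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
N, F = 4, 5
h = rng.normal(size=(N, F))

# Hypothetical layer weights; in GraphSAGE these are trained, not random.
W = 0.1 * rng.normal(size=(2 * F, F))
sample_size = 2  # fixed-size neighbor sample, as in the paper

def sage_layer(h, adj, W, sample_size):
    h_next = np.zeros_like(h)
    for v in range(len(h)):
        nbrs = adj[v]
        # sample a fixed number of neighbors (with replacement if too few)
        idx = rng.choice(nbrs, size=sample_size,
                         replace=len(nbrs) < sample_size)
        agg = h[idx].mean(axis=0)                   # mean aggregator
        cat = np.concatenate([h[v], agg])           # concat self + aggregate
        z = np.maximum(cat @ W, 0.0)                # linear + ReLU
        h_next[v] = z / (np.linalg.norm(z) + 1e-8)  # normalize embedding
    return h_next

h1 = sage_layer(h, adj, W, sample_size)  # one hop; stack layers for more hops
```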

This is a well-put together paper, with some proofs of convergence etc -- but it still feels only lightly tested. As with many of these papers, could benefit from a positive control, where the generating function is known & you can see how well the algorithm discovers it.

Otherwise, the structure / algorithm feels rather intuitive; surprising to me that it was not developed before the matrix factorization methods.

Worth comparing this to word2vec embeddings, where local words are used to predict the current word & the resulting vector in the neck-down of the NN is the representation.

ref: -0 tags: bleaching STED dye phosphorus japan date: 07-16-2020 14:06 gmt revision:1 [0] [head]

Super-Photostable Phosphole-Based Dye for Multiple-Acquisition Stimulated Emission Depletion Imaging

  • Use the electron withdrawing ability of a phosphole group (P = O) to reduce photobleaching
  • Derived from another photostable dye, C-Naphox, only with a different mechanism of fluorescence -- pi-pi* transfer rather than intramolecular charge transfer (ICT).
  • Much more stable than Alexa 488 (aka sulfonated fluorescein, which is not the most stable dye..)
  • Suitable for multiple STED images, unlike the other dyes. (Note!)

ref: -0 tags: delete date: 06-10-2020 17:05 gmt revision:5 [4] [3] [2] [1] [0] [head]

Test equation:

$$ fitness = \left( \frac{2}{1+e^{K_{on} / -500}} - 1 \right) * \left( \frac{2}{1+e^{\Delta FF / -30}} - 1 \right) $$

ref: -0 tags: constitutional law supreme court date: 06-03-2020 01:40 gmt revision:0 [head]

Spent a while this evening reading about Qualified Immunity -- the law that permits government officials (e.g. police officers) immunity when 'doing their jobs'. It's perhaps one root of the George Floyd / racism protests, as it has set a precedent that US police can be violent and get away with it. (This is also related to police unions and collective liability loops... anyway)

The supreme court has the option to take cases challenging the constitutionality of Qualified Immunity, which many on both sides of the political spectrum want them to do.

It 'got' this power via Marbury vs. Madison. M v. M is self-referential genius:

  • They ruled the original action (blocking an appointment) was illegal
  • but the court does not have the power to make these decisions
  • because the congressional law that gave the Supreme Court that power was unconstitutional.
  • Instead, the supreme court has the power to decide if laws (in this case, those governing its jurisdiction) are constitutional.
  • E.g. SCOTUS initiated judicial review -- and the expansion of its jurisdiction over Congressional law -- by striking down the very law by which Congress had expanded its jurisdiction.
  • This was also done while threading the loops to satisfy the then-present political pressure (which wanted the original appointment to be illegal), so that those in power (Thomas Jefferson) were aligned with the increase in court power, and the precedent could persist.

As a person curious how systems gain complexity and feedback loops ... so much nerdgasm.

ref: -0 tags: rutherford journal computational theory neumann complexity wolfram date: 05-05-2020 18:15 gmt revision:0 [head]

The Structures for Computation and the Mathematical Structure of Nature

  • Broad, long, historical.

ref: -2017 tags: google deepmind compositional variational autoencoder date: 04-08-2020 01:16 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

SCAN: learning hierarchical compositional concepts

  • From DeepMind, first version Jul 2017 / v3 June 2018.
  • Starts broad and strong:
      • "The seemingly infinite diversity of the natural world [arises] from a relatively small set of coherent rules"
      • Relative to what? What's the order of magnitude here? In personal experience, each domain involves a large pile of relevant details..
    • "We conjecture that these rules give rise to regularities that can be discovered through primarily unsupervised experiences and represented as abstract concepts"
    • "If such representations are compositional and hierarchical, they can be recombined into an exponentially large set of new concepts."
    • "Compositionality is at the core of such human abilities as creativity, imagination, and language-based communication."
    • This addresses the limitations of deep learning, which is overly data-hungry (low sample efficiency), tends to overfit the data, and requires human supervision.
  • Approach:
    • Factorize the visual world with a $\beta$ -VAE to learn a set of representational primitives through unsupervised exposure to visual data.
    • Expose SCAN (or rather, a module of it) to a small number of symbol-image pairs, from which the algorithm identifies the set of visual primitives (features from the beta-VAE) that the examples have in common.
      • E.g. this is purely associative learning, with a finite one-layer association matrix.
    • Test in both image-to-symbol and symbol-to-image directions. For the latter, allow irrelevant attributes to be filled in from the priors (this is important later in the paper..)
    • Add in a third module, which allows learning of compositions of the features, a la set notation: AND ( $\cup$ ), IN-COMMON ( $\cap$ ) & IGNORE ( $\setminus$ or '-'). This is via a low-parameter convolutional model.
  • Notation:
    • $q_{\phi}(z_x|x)$ is the encoder model; $\phi$ are the encoder parameters, $x$ is the visual input, $z_x$ are the latent parameters inferred from the scene.
    • $p_{\theta}(x|z_x)$ is the decoder model; $x \propto p_{\theta}(x|z_x)$ , $\theta$ are the decoder parameters, and $x$ is now the reconstructed scene.
  • From this, the loss function of the beta-VAE is:
    • $\mathbb{L}(\theta, \phi; x, z_x, \beta) = \mathbb{E}_{q_{\phi}(z_x|x)} [\log p_{\theta}(x|z_x)] - \beta D_{KL} (q_{\phi}(z_x|x) || p(z_x))$ where $\beta > 1$
      • That is, maximize the auto-encoder fit (the expectation of the decoder over the encoder output -- aka the pixel log-likelihood) minus the KL divergence between the encoder distribution and $p(z_x)$
        • $p(z) \propto \mathcal{N}(0, I)$ -- diagonal normal distribution.
        • $\beta$ comes from the Lagrangian solution to the constrained optimization problem:
        • $\max_{\phi,\theta} \mathbb{E}_{x \sim D} [\mathbb{E}_{q_{\phi}(z|x)}[\log p_{\theta}(x|z)]]$ subject to $D_{KL}(q_{\phi}(z|x)||p(z)) < \epsilon$ , where D is the domain of images etc.
      • Claim that this loss function tips the scale too far away from accurate reconstruction with sufficient visual de-tangling (that is: if significant features correspond to small details in pixel space, they are likely to be ignored); instead they adopt the approach of the denoising auto-encoder ref, which uses the feature L2 norm instead of the pixel log-likelihood:
    • $\mathbb{L}(\theta, \phi; X, z_x, \beta) = -\mathbb{E}_{q_{\phi}(z_x|x)}||J(\hat{x}) - J(x)||_2^2 - \beta D_{KL} (q_{\phi}(z_x|x) || p(z_x))$ where $J : \mathbb{R}^{W \times H \times C} \rightarrow \mathbb{R}^N$ maps from images to high-level features.
      • This $J(x)$ is from another neural network (transfer learning) which learns features beforehand.
      • It's a multilayer perceptron denoising autoencoder [Vincent 2010].
  • The SCAN architecture includes an additional element: another VAE, trained simultaneously on the labeled inputs $y$ and the latent outputs $z_x$ from the encoder given $x$ .
  • In this way, they can present a description $y$ to the network, which is then recomposed into $z_y$ , which then produces an image $\hat{x}$ .
    • The whole network is trained by minimizing:
    • $\mathbb{L}_y(\theta_y, \phi_y; y, x, z_y, \beta, \lambda) = 1^{st} - 2^{nd} - 3^{rd}$
      • 1st term: $\mathbb{E}_{q_{\phi_y}(z_y|y)}[\log p_{\theta_y} (y|z_y)]$ , the log-likelihood of the decoded symbols given the encoded latents $z_y$
      • 2nd term: $\beta D_{KL}(q_{\phi_y}(z_y|y) || p(z_y))$ , the weighted KL divergence between the encoded latents and the diagonal normal prior.
      • 3rd term: $\lambda D_{KL}(q_{\phi_x}(z_x|x) || q_{\phi_y}(z_y|y))$ , the weighted KL divergence between the latents from the images and the latents from the description $y$ .
        • They note that the direction of the divergence matters; I suspect it took some experimentation to see what's right.
  • Final element! A convolutional recombination element, implemented as a tensor product between $z_{y1}$ and $z_{y2}$ that outputs a one-hot encoding of the set-operation, which is fed to a (hardcoded?) transformation matrix.
    • I don't think this is great shakes. Could have done this with a small function; no need for a neural network.
    • Trained with very similar loss function as SCAN or the beta-VAE.

  • Testing:
  • They seem to have used a very limited subset of "DeepMind Lab" -- all of the concept or class labels could have been implemented easily, e.g. a single-pixel detector for the wall color. Quite disappointing.
  • This is marginally more interesting -- the network learns to eliminate latent factors as it's exposed to examples (just like perhaps a Bayesian network.)
  • Similarly, the CelebA tests are meh ... not a clear improvement over the existing VAEs.
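Stepping back to the beta-VAE objective described above, a minimal numerical sketch, using the closed-form KL for a diagonal-Gaussian encoder; the feature extractor J is stubbed as the identity (in the paper it is a pretrained denoising autoencoder), and beta = 4 is an arbitrary illustrative choice:

```python
import numpy as np

# Closed-form KL( q(z|x) || N(0, I) ) for a diagonal-Gaussian encoder:
# 0.5 * sum( mu^2 + sigma^2 - log(sigma^2) - 1 )
def kl_to_standard_normal(mu, log_var):
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1.0)

# Beta-VAE loss (to minimize): reconstruction term + beta * KL.
# Per the denoising-autoencoder variant, reconstruction is an L2 in feature
# space, J(x_hat) vs J(x); J is a stand-in identity here.
def beta_vae_loss(x, x_hat, mu, log_var, beta=4.0, J=lambda v: v):
    recon = np.sum((J(x_hat) - J(x))**2)
    return recon + beta * kl_to_standard_normal(mu, log_var)

x = np.array([1.0, 2.0, 3.0])
# perfect reconstruction, encoder already matching the prior -> zero loss
loss = beta_vae_loss(x, x, mu=np.zeros(3), log_var=np.zeros(3), beta=4.0)
```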

ref: -2020 tags: evolution neutral drift networks random walk entropy population date: 04-08-2020 00:48 gmt revision:0 [head]

Localization of neutral evolution: selection for mutational robustness and the maximal entropy random walk

  • The take-away of the paper is that, with larger populations, random mutation and recombination make areas of the graph that take several steps to reach (in the figure, this is Maynard Smith's four-letter mutation word game) less likely to be visited.
  • This is because recombination makes the population adhere more closely to the 'giant' mode. In Maynard's game, this is the component of 2268 words (out of 2405 meaningful words) that can be reached by successive letter changes.
  • The author extends it to van Nimwegen's 1999 paper / RNA genotype-secondary structure. It's not as bad as Maynard's game, but still has much lower graph-theoretic entropy than the actual population.
    • He suggests that if the entropic size of the giant component is much smaller than its dictionary size, then populations are likely to be trapped there.

  • Interesting, but I'd prefer to have an expert peer-review it first :)

ref: -0 tags: asymmetric locality sensitive hash maximum inner product search sparsity date: 03-30-2020 02:17 gmt revision:5 [4] [3] [2] [1] [0] [head]

Improved asymmetric locality sensitive hashing for maximum inner product search

  • Like many other papers, this one is based on a long lineage of locality-sensitive hashing papers.
  • Key innovation, in [23] The power of asymmetry in binary hashing, was the development of asymmetric hashing -- the hash function of the query is different than the hash function used for storage. Roughly, this allows additional degrees of freedom since the similarity-function is (in the non-normalized case) non-symmetric.
    • For example, take query Q = [1 1] with keys A = [1 -1] and B = [3 3]. The nearest neighbor is A (distance 2), whereas the maximum inner product is B (inner product 6).
    • Alternately: self-inner product for Q and A is 2, whereas for B it's 18. Self-similarity is not the highest with inner products.
    • Norm of the query does not have an effect on the arg max of the search, though. Hence, for the paper assume that the query has been normalized for MIPS.
  • In this paper instead they convert MIPS into approximate cosine similarity search (which is like normalized MIPS), which can be efficiently solved with signed random projections.
  • (Established): LSH-L2 distance:
    • Sample a random vector a, iid normal N(0,1)
    • Sample a random offset b uniformly between 0 and r
      • r is the window size / radius (free parameters?)
      • Hash function is then floor((a . x + b) / r), the floor of the inner product of the vector a and input x, plus b, divided by the radius.
      • I'm not sure about how the floor op is converted to bits of the actual hash -- ?
  • (Established): LSH-correlation, signed random projections $h^{sign}$ :
    • Hash is the sign of the inner product of the input vector and a uniform random vector a.
    • This is a two-bit random projection [13][14].
  • (New) Asymmetric-LSH-L2:
    • $P(x) = [x; ||x||^2_2; ||x||^4_2; \ldots; ||x||^{2^m}_2]$ -- this is the pre-processing hashing of the 'keys'.
      • Requires that the norm of these keys satisfy $||x||_2 \leq U < 1$
      • $m \geq 3$
    • $Q(x) = [x; 1/2; 1/2; \ldots; 1/2]$ -- hashing of the queries.
    • See the mathematical explanation in the paper, but roughly: transformations P and Q, when the norms are less than 1, provide a correction to the L2 distance $||Q(p) - P(x_i)||_2$ , making its rank correlate with the un-normalized inner product.
  • They then change the augmentation to:
    • $P(x) = [x; 1/2 - ||x||^2_2; 1/2 - ||x||^4_2; \ldots; 1/2 - ||x||^{2^m}_2]$
    • $Q(x) = [x; 0; \ldots; 0]$
    • This allows use of signed nearest-neighbor search to be used in the MIPS problem. (e.g. the hash is the sign of P and Q, per above; I assume this is still a 2-bit operation?)
  • Then they expand the U, m trade-off function $\rho$ to allow for non-normalized queries. U depends on m and c (m is the codeword extension, and c is the ratio between on-target and off-target hash hits).
  • Tested on the Movielens and Netflix databases, using SVD preprocessing on the user-item matrix (a full-rank matrix of every user's rating on every movie -- mostly zeros!) to get at the latent vectors.
  • In the above plots, recall (hah) that precision is the number of true positives / the number of draws k, as the number of draws k increases; recall is the number of true positives / the total number of relevant items (top-N).
    • Clearly, the curve bends up and to the right when there are a lot of hash tables K.
    • Example datapoint: 50% precision at 40% recall, top 5. So on average you get 2 correct hits in 4 draws. Or: 40% precision, 20% recall, top 10: 2 hits in 5 draws. 20/40: 4 hits in 20 draws. (hit: correctly within the top-N)
    • So ... it's not that great.
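To make the transforms concrete, here is a small numeric sketch of the second (query-normalized) ALSH variant; the dimensions, U, and number of hash bits are illustrative choices, not the paper's:

```python
import numpy as np

# Sketch of asymmetric LSH for MIPS: keys get norm-power padding via P,
# queries get zero padding via Q, and both are hashed with signed random
# projections. Sizes here are toy values.

rng = np.random.default_rng(0)
d, n, m, bits = 8, 100, 3, 16
U = 0.83                                  # keys rescaled so max ||x||_2 = U < 1

def P(x):
    """Key transform: append 1/2 - ||x||^(2^i) terms."""
    return np.concatenate([x, [0.5 - np.linalg.norm(x) ** (2 ** (i + 1))
                               for i in range(m)]])

def Q(q):
    """Query transform: normalize, then zero-pad."""
    return np.concatenate([q / np.linalg.norm(q), np.zeros(m)])

keys = rng.normal(size=(n, d))
keys *= U / np.linalg.norm(keys, axis=1).max()     # enforce ||x||_2 <= U

a = rng.normal(size=(bits, d + m))                 # random hyperplanes

def sign_hash(v):
    return (a @ v > 0).astype(np.uint8)            # one bit per projection

query = rng.normal(size=d)
codes = np.array([sign_hash(P(x)) for x in keys])
qcode = sign_hash(Q(query))
ham_sim = (codes == qcode).sum(axis=1)             # Hamming similarity
approx_best = int(np.argmax(ham_sim))              # ALSH's MIPS candidate
true_best = int(np.argmax(keys @ query))           # exact MIPS answer
```

With enough bits and tables, the Hamming-similarity ranking correlates with the un-normalized inner product, per the quote above.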

Use case: Capsule: a camera based positioning system using learning
  • Uses 512 SIFT features as keys and queries to LSH. Hashing is computed via sparse addition / subtraction algorithm, with K bits per hash table (not quite random projections) and L hash tables. K = 22 and L = 24. ~ 1000 training images.
  • Best matching image is used as the location of the current image.

ref: -0 tags: reinforcement learning distribution DQN Deepmind dopamine date: 03-30-2020 02:14 gmt revision:5

PMID-31942076 A distributional code for value in dopamine based reinforcement learning

  • Synopsis is staggeringly simple: dopamine neurons encode / learn to encode a distribution of reward expectations, not just the mean (aka the expected value) of the reward at a given state-action pair.
  • This is almost obvious neurally -- of course dopamine neurons in the striatum represent different levels of reward expectation; there is population diversity in nearly everything in neuroscience. The new interpretation is that neurons have different slopes for their susceptibility to positive and negative rewards (or rather, reward predictions), which results in different inflection points where the neurons are neutral about a reward.
    • This constitutes more optimistic and pessimistic neurons.
  • There is already substantial evidence that such a distributional representation enhances performance in DQN (Deep q-networks) from circa 2017; the innovation here is that it has been extended to experiments from 2015 where mice learned to anticipate water rewards with varying volume, or varying probability of arrival.
  • The model predicts a diversity of asymmetry below and above the reversal point
  • Also predicts that the distribution of reward responses should be decoded by neural activity ... which it is ... but it is not surprising that a bespoke decoder can find this information in the neural firing rates. (Have not examined in depth the decoding methods)
  • Still, this is a clear and well-written, well-thought out paper; glad to see new parsimonious theories about dopamine out there.
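The core idea is easy to simulate. A toy sketch (mine, not the paper's code): units that weight positive vs. negative reward prediction errors asymmetrically converge to different expectiles of the reward distribution, i.e. different reversal points:

```python
import numpy as np

# Units with asymmetric learning rates for positive vs. negative reward
# prediction errors; tau = alpha+ / (alpha+ + alpha-). Reward
# distribution and rates are illustrative.

rng = np.random.default_rng(1)
rewards = rng.choice([0.1, 1.0, 5.0], p=[0.3, 0.5, 0.2], size=20000)

taus = np.array([0.2, 0.5, 0.8])   # pessimistic, neutral, optimistic
V = np.zeros_like(taus)            # per-unit value predictions
lr = 0.01

for r in rewards:
    delta = r - V                                  # per-unit prediction error
    V += lr * np.where(delta > 0, taus, 1 - taus) * delta

# V ends up ordered pessimistic < neutral < optimistic; the tau = 0.5
# unit recovers the ordinary expected value.
```

Each unit's value estimate is the point where its scaled positive and negative errors balance: its reversal point.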

ref: -2016 tags: locality sensitive hash deep learning regularization date: 03-30-2020 02:07 gmt revision:5

Scalable and sustainable deep learning via randomized hashing

  • Central idea: replace dropout, adaptive dropout, or winner-take-all with a fast (sublinear time) hash based selection of active nodes based on approximate MIPS (maximum inner product search) using asymmetric locality-sensitive hashing.
    • This avoids a lot of the expensive inner-product multiply-accumulate work & energy associated with nodes that will either be completely off due to the ReLU or other nonlinearity -- or just not important for the algorithm + current input.
    • The result shows that you don't need very many neurons active in a given layer for successful training.
  • C.f. adaptive dropout, which chooses the nodes based on their activations: a few nodes are sampled from the network probabilistically, based on each node's activation for the current input.
    • Adaptive dropouts demonstrate better performance than vanilla dropout [44]
    • It is possible to drop significantly more nodes adaptively than without while retaining superior performance.
  • WTA is an extreme form of adaptive dropout that uses mini-batch statistics to enforce a sparsity constraint. [28] {1507} Winner take all autoencoders
  • Our approach uses the insight that selecting a very sparse set of hidden nodes with the highest activations can be reformulated as dynamic approximate query processing, solvable with LSH.
    • LSH can be sub-linear time; normal processing involves the inner product.
    • LSH maps similar vectors into the same bucket with high probability. That is, it maps vectors into integers (bucket number)
  • Similar approach: Hashed nets [6], which aimed to decrease the number of parameters in a network by using a universal random hash function to tie weights. Compressing neural networks with the Hashing trick
    • "HashedNets uses a low-cost hash function to randomly group connection weights into hash buckets, and all connections within the same hash bucket share a single parameter value."
  • Ref [38] shows how asymmetric hash functions allow LSH to be converted to a sub-linear time algorithm for maximum inner product search (MIPS).
  • Used multi-probe LSH: rather than having a large number of hash tables (L) which increases hash time and memory use, they probe close-by buckets in the hash tables. That is, they probe bucket at B_j(Q) and those for slightly perturbed query Q. See ref [26].
  • See reference [2] for theory...
  • Following ref [42], use K randomized hash functions to generate the K data bits per vector. Each bit is the sign of the asymmetric random projection. Buckets contain a pointer to the node (neuron); only active buckets are kept around.
    • The K hash functions serve to increase the precision of the fingerprint -- found nodes are more expected to be active.
    • Have L hash tables for each hidden layer; these are used to increase the probability of finding useful / active nodes due to the randomness of the hash function.
    • Hash is asymmetric in the sense that the query and collection data are hashed independently.
  • In every layer during SGD, compute K x L hashes of the input, probe about 10 L buckets, and take their union. Experiments: K = 6 and L = 5.
  • See ref [30] where authors show around 500x reduction in computations for image search following different algorithmic and systems choices. Capsule: a camera based positioning system using learning {1506}
  • Use relatively small test data sets -- MNIST 8M, NORB, Convex, Rectangles -- each resized to have small-ish input vectors.
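The selection step above can be sketched in a few lines (toy sizes; plain bucket lookup, without the multi-probe refinement):

```python
import numpy as np

# Hash-based active-node selection: each hidden node's weight vector is
# stored in L hash tables of K-bit signed random projections; per input,
# only the retrieved nodes pay for inner products. Not the paper's code.

rng = np.random.default_rng(2)
d, n_nodes, K, L = 32, 256, 6, 5

W = rng.normal(size=(n_nodes, d))       # hidden-layer weight vectors
planes = rng.normal(size=(L, K, d))     # L tables x K projection vectors

def bucket(v, t):
    """K-bit signature of v in table t, packed into an integer."""
    bits = planes[t] @ v > 0
    return int("".join("1" if b else "0" for b in bits), 2)

tables = [{} for _ in range(L)]
for j in range(n_nodes):
    for t in range(L):
        tables[t].setdefault(bucket(W[j], t), set()).add(j)

def active_nodes(x):
    """Union over tables of the nodes sharing x's bucket."""
    found = set()
    for t in range(L):
        found |= tables[t].get(bucket(x, t), set())
    return found

x = rng.normal(size=d)
act = sorted(active_nodes(x))
# Only this subset pays for the inner products + ReLU:
h = np.maximum(0, W[act] @ x) if act else np.zeros(0)
```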

  • Really want more analysis of what exactly is going on here -- what happens when you change the hashing function, for example? How much is the training dependent on suitable ROC or precision/recall on the activation?
    • For example, they could have calculated the actual real activation & WTA selection, and compared it to the results from the hash function; how correlated are they?

ref: -2002 tags: hashing frequent items count sketch algorithm google date: 03-30-2020 02:04 gmt revision:7

Finding frequent items in data streams

  • Notation:
    • S is a data stream, S = q_1, q_2, \ldots, q_n , of length n.
    • Each object q_i \in O = \{o_1, \ldots, o_m\} . That is, there are m total possible objects (e.g. English words).
    • Object o_i occurs n_i times in S. The o_i are ordered so that n_1 \geq n_2 \geq \ldots \geq n_m .
  • Task:
    • Given an input stream S, integer k, and real \epsilon
    • Output a list of k elements from S such that each element has n_i > (1 - \epsilon) n_k .
      • That is, if the ordering is perfect, n_i \geq n_k , with equality on the last element.
  • Algorithm:
    • h_1, \ldots, h_t are hashes from object q to buckets \{1, \ldots, b\}
    • s_1, \ldots, s_t are hashes from object q to \{-1, +1\}
    • For each symbol, add it to the 2D hash array by hashing first with h_i to pick the bucket, then incrementing that counter by s_i .
      • The double-hashing is to reduce the effect of collisions with high-frequency items.
    • When querying for the frequency of an object, hash as above, and take the median over i of count[i, h_i(q)] \cdot s_i(q) .
    • t = O(\log(n / \delta)) , where the algorithm fails with probability at most \delta .
  • Demonstrate proof of convergence / function with Zipfian distributions with varying exponent. (I did not read through this).
  • Also showed that it's possible to compare these hash-counts directly to see what's changed, or importantly if the documents are different.
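The algorithm above fits in a few lines; here Python's salted tuple-hash stands in for the hash families h_i and s_i:

```python
import numpy as np

# Count-sketch: t hash rows, b buckets; each item is counted with a
# +/-1 sign and its frequency queried via the median across rows.

class CountSketch:
    def __init__(self, t=5, b=256, seed=0):
        rng = np.random.default_rng(seed)
        self.t, self.b = t, b
        self.table = np.zeros((t, b), dtype=np.int64)
        self.salts = rng.integers(1, 2**31, size=(t, 2))

    def _h(self, i, q):          # bucket hash h_i
        return hash((int(self.salts[i, 0]), q)) % self.b

    def _s(self, i, q):          # sign hash s_i
        return 1 if hash((int(self.salts[i, 1]), q)) % 2 else -1

    def add(self, q):
        for i in range(self.t):
            self.table[i, self._h(i, q)] += self._s(i, q)

    def query(self, q):
        return int(np.median([self.table[i, self._h(i, q)] * self._s(i, q)
                              for i in range(self.t)]))

cs = CountSketch()
for w in ["the"] * 1000 + ["cat"] * 100 + ["sat"] * 10:
    cs.add(w)
# cs.query("the") is ~1000; single-row collisions are suppressed by the median.
```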

Mission: Ultra large-scale feature selection using Count-Sketches
  • Task:
    • Given a labeled dataset (X_i, y_i) for i \in \{1, 2, \ldots, n\} , with X_i \in \mathbb{R}^p, y_i \in \mathbb{R}
    • Find the k-sparse feature vector / linear regression for the mean-squares problem \min_{||\beta||_0 = k} ||y - X\beta||_2
      • ||\beta||_0 = k counts the non-zero elements in the feature vector.
    • The number of features p is so large that a dense \beta cannot be stored in memory. (X is of course sparse.)
  • Such data may be from ad click-throughs, or from genomic analyses ...
  • Use the count-sketch algorithm (above) for capturing & continually updating the features for gradient update.
    • That is, treat the stream of gradient updates, in the normal form g_i = 2\lambda (y_i - X_i \beta_i)^T X_i , as the semi-continuous time series used above as S .
  • Compare this with greedy thresholding and iterative hard thresholding (IHT), which e.g. throw away gradient information after each batch.
    • This discards small gradients which may be useful for the regression problem.
  • Works better, but not necessarily better than straight feature hashing (FH).
  • Meh.
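A toy illustration of the Mission idea (my own sketch, not theirs): accumulate the gradient stream in a count-sketch rather than a dense beta, then query heavy-hitter features back out:

```python
import numpy as np

# Gradient mass for a huge sparse feature space is kept in a t x b
# count-sketch instead of a dense coefficient vector. Python's tuple
# hash (deterministic for ints) stands in for the hash families.

rng = np.random.default_rng(3)
p, t, b = 10**6, 3, 4096            # p features; sketch is t rows x b buckets
table = np.zeros((t, b))

def loc(i, idx):                    # bucket hash h_i
    return hash((i, idx)) % b

def sgn(i, idx):                    # sign hash s_i
    return 1 if hash((i, idx, 1)) % 2 else -1

def upd(idx, g):
    """Accumulate gradient value g for feature idx into the sketch."""
    for i in range(t):
        table[i, loc(i, idx)] += g * sgn(i, idx)

def est(idx):
    """Median-of-rows estimate of the accumulated gradient for idx."""
    return float(np.median([table[i, loc(i, idx)] * sgn(i, idx)
                            for i in range(t)]))

# Stream of sparse gradient updates: feature 42 is truly heavy.
for _ in range(200):
    upd(42, 1.0)
    upd(int(rng.integers(p)), 0.01)
# est(42) recovers ~200 despite never storing a dense beta.
```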

ref: -2015 tags: winner take all sparsity artificial neural networks date: 03-28-2020 01:15 gmt revision:0

Winner-take-all Autoencoders

  • During training of fully connected layers, they enforce a winner-take all lifetime sparsity constraint.
    • That is: when training using mini-batches, they keep the k percent largest activation of a given hidden unit across all samples presented in the mini-batch. The remainder of the activations are set to zero. The units are not competing with each other; they are competing with themselves.
    • The rest of the network is a stack of ReLU layers (upon which the sparsity constraint is applied) followed by a linear decoding layer (which makes interpretation simple).
    • They stack them via sequential training: train one layer from the output of another & not backprop the errors.
  • Works, with lower sparsity targets, also for RBMs.
  • Extended the result to WTA convnets -- here they enforce both spatial and temporal (mini-batch) sparsity.
    • Spatial sparsity involves selecting the single largest hidden unit activity within each feature map. The other activities and derivatives are set to zero.
    • At test time, this sparsity constraint is released, and instead they use a 4 x 4 max-pooling layer & use that for classification or deconvolution.
  • To apply both spatial and temporal sparsity, select the highest spatial response (e.g. one unit in a 2d plane of convolutions; all have the same weights) for each feature map. Do this for every image in a mini-batch, and then apply the temporal sparsity: each feature map gets to be active exactly once, and in that time only one hidden unit (or really, one location of the input and common weights (depending on stride)) undergoes SGD.
    • Seems like it might train very slowly. Authors didn't note how many epochs were required.
  • This, too can be stacked.
  • To train on larger image sets, they first extract 48 x 48 patches & again stack...
  • Test on MNIST, SVHN, CIFAR-10 -- works ok, and well even with few labeled examples (which is consistent with their goals)
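The lifetime-sparsity step is simple to state in numpy (a sketch; batch and layer sizes are illustrative):

```python
import numpy as np

# Lifetime winner-take-all: within a mini-batch, each hidden unit keeps
# only its top-k% largest activations across the batch; the rest (and
# their gradients, in training) are zeroed. Units compete with
# themselves across time, not with each other.

rng = np.random.default_rng(7)
batch, hidden = 64, 32
acts = np.maximum(0, rng.normal(size=(batch, hidden)))   # ReLU activations

def lifetime_wta(a, keep_frac=0.05):
    k = max(1, int(a.shape[0] * keep_frac))
    out = np.zeros_like(a)
    # For each column (hidden unit), keep the k largest batch entries.
    top = np.argsort(a, axis=0)[-k:, :]
    cols = np.arange(a.shape[1])
    out[top, cols] = a[top, cols]
    return out

sparse = lifetime_wta(acts)
```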

ref: -0 tags: VARNUM GEVI genetically encoded voltage indicators FRET Ace date: 03-18-2020 17:12 gmt revision:5

PMID-30420685 Fast in-vivo voltage imaging using a red fluorescent indicator

  • Kannan M, Vasan G, Huang C, Haziza S, Li JZ, Inan H, Schnitzer MJ, Pieribone VA.
  • Other genetically encoded voltage indicators (GEVI):
    • PMID-22958819 ArcLight (Pieribone also last author); sign of \Delta F/F negative, but large, 35%! Slow tho? improvement in speed
    • ASAP3 -- \Delta F/F large, \tau = 3 ms.
    • PMID-26586188 Ace-mNeon FRET based, Acetabularia opsin, fast kinetics + brightness of mNeonGreen.
    • Archon1 -- fast and sensitive, found (like VARNUM) using a robotic directed evolution or direct search strategy.
  • VARNAM is based on Acetabularia (Ace) + mRuby3, also FRET based, found via high-throughput voltage screen.
  • Archaerhodopsin require 1-12 W/mm^2 of illumination, vs. 50 mw/mm^2 for GFP based probes. Lots of light!
  • Systematic optimization of voltage sensor function: both the linker region (288 mutants), which affects FRET efficiency, as well as the opsin fluorophore region (768 mutants), which affects the wavelength of absorption / emission.
  • Some intracellular clumping (which will negatively affect sensitivity), but mostly localized to the membrane.
  • Sensitivity is still imperfect -- 4% in-vivo cortical neurons, though it’s fast enough to resolve 100 Hz spiking.
  • Can resolve post-synaptic EPSCs, but < 1% \Delta F/F .
  • Tested all-optical ephys using VARNAM + a blueshifted channelrhodopsin, CheRiff, both sparsely and in a PV-targeted transgenic model. Both work, but this is a technique paper; no real results.
  • Tested TEMPO fiber-optic recording in freely behaving mice (ish) -- induced ketamine waves, 0.5-4Hz.
  • And odor-induced activity in flies, using split-Gal4 expression tools. So many experiments.

ref: -2019 tags: Vale photostability bioarxiv DNA origami photobleaching date: 03-10-2020 21:59 gmt revision:5

A 6-nm ultra-photostable DNA Fluorocube for fluorescence imaging

  • Cy3n = sulfonated version of Cy3.
  • JF549 = azetidine modified version of tetramethyl rhodamine.

Also including some correspondence with the authors:


Nice work and nice paper, thanks for sharing .. and not at all what I had expected from Ron's comments! Below are some comments ... would love your opinion.

I'd expect that the molar absorption coefficients for the fluorocubes should be ~6x larger than for the free dyes and the single dye cubes (measured?), yet the photon yields for all except Cy3N maybe are around the yield for one dye molecule. So the quantum yield must be decreased by ~6x?

This in turn might be from a middling FRET which reduces lifetime, thereby the probability of ISC, photoelectron transfer, and hence photobleaching.

I wonder if in the case of ATTO 647N, Cy5, and Cy3, the DNA is partly shielding the fluorophores from solvent (ala ethidium bromide), which also helps with stability, just like in fluorescent proteins. ATTO 647N generates a lot of singlet oxygen, who knows what it's doing to DNA.

Can you do a log-log autocorrelation of the blinking timeseries of the constructs? This may reveal different rate constants controlling dark/light states (though, for 6 coupled objects, might not be interpretable!)

Also, given the effect of DNA shielding, have you compared to free dyes to single-dye cubes other than supp fig 10? The fact that sulfonation made such a huge effect in brightness is suggestive.

Again, these are super interesting & exciting results!


I haven't directly looked at the molar absorption coefficient but judging from the data that I collected for the absorption spectra, there is certainly an increase for the fluorocubes compared to single dyes. I agree that this would be an interesting experiment and I am planning to collect data to measure the molar absorption coefficient. I would also expect a ~6 fold increase for the Fluorocubes.

Yes, we suspect homo FRET to help reduce photobleaching. So far we only measured lifetimes in bulk but are planning to obtain lifetime data on the single-molecule level soon.

We also wondered if the DNA is providing some kind of shield for the fluorophores but could not design an experiment to directly test this hypothesis. If you have a suggestion, that would be wonderful.

The log-log autocorrelation of blinking events is indeed difficult to interpret. Already individual intensity traces of fluorocubes are difficult to analyze as many of them get brighter before they bleach. We are also wondering if some fluorocubes are emitting two photons simultaneously. We will hopefully be able to measure this soon.

ref: -0 tags: Na Ji 2p two photon fluorescent imaging pulse splitting damage bleaching date: 03-10-2020 21:44 gmt revision:6

PMID-18204458 High-speed, low-photodamage nonlinear imaging using passive pulse splitters

  • Core idea: take a single pulse and spread it out to N = 2^k pulses using reflections and delay lines.
  • Assume two optical processes, signal S \propto I^{\alpha} and photobleaching/damage D \propto I^{\beta} , with \beta > \alpha > 1 .
  • Then an N-pulse splitter requires N^{1 - 1/\alpha} greater average power but reduces the damage by N^{1 - \beta/\alpha} .
  • At constant signal, the same N-pulse splitter requires \sqrt{N} more power, consistent with two-photon excitation (proportional to the square of the intensity: N pulses of \sqrt{N}/N intensity each, 1/N fluorescence per pulse, summing to the same overall fluorescence).
  • This allows for shorter dwell times, higher power at the sample, lower damage, slower photobleaching, and better SNR for fluorescently labeled slices.
  • Examine the list of references too, e.g. "Multiphoton multifocal microscopy exploiting a diffractive optical element" (2003)
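Plugging numbers into these scaling relations, with alpha = 2 for two-photon signal and an assumed damage exponent beta = 3 (illustrative; the actual exponent is measured):

```python
# Scaling of an N-pulse splitter at constant signal, per the relations
# above. alpha = 2 (two-photon); beta = 3 is an assumed damage exponent.
N = 128                                   # a 2^7-pulse splitter
alpha, beta = 2, 3

power_factor = N ** (1 - 1 / alpha)       # extra average power needed
damage_factor = N ** (1 - beta / alpha)   # relative damage

# 128 pulses: ~11.3x more average power, damage reduced to ~8.8%.
```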

  • In practice, a pulse picker is useful when power is limited and bleaching is not a problem (as is with GCaMP6x)

ref: -0 tags: DNA paint FRET tag superresolution imaging oligos date: 02-20-2020 16:28 gmt revision:1

Accelerated FRET-PAINT Microscopy

  • Well isn't that smart -- they use a FRET donor, which is free to associate and dissociate from a host DNA strand, and a more-permanently attached DNA acceptor, which blinks due to FRET, for superresolution imaging.
  • As FRET acceptors aren't subject to bleaching (or, perhaps, much less subject to bleaching), this eliminates that problem...
  • However, the light levels used, ~1 kW/cm^2, do damage the short DNA oligos, which interferes with reversible association.
  • Interestingly, CF488 donor showed very little photobleaching; DNA damage was instead the limiting problem.
    • Are dyes that bleach more slowly better at exporting their singlet oxygen (?) or aberrant excited states (?) to neighboring molecules?

ref: -0 tags: rhodamine derivatives imidazole bacterial resistance date: 02-19-2020 19:10 gmt revision:2

A diversity-oriented rhodamine library for wide-spectrum bactericidal agents with low inducible resistance against resistant pathogens

  • Tested a wide number of rhodamine derivatives, which were synthesized with a 'mild' route. This includes all sorts of substitutions on the carbon opposite the oxygen.
  • Tested the fluorescence properties ... many if not all are fluorescent. Supplementary information lists the abs/em spectra, which is kind of a goldmine (if it can be trusted).
  • No mention of light or dark in the paper. I suspect that these rhodamine derivatives are killing via singlet oxygen production. (Then again, I only skimmed the paper..)
    • Yes but: "Rhodamine dyes mainly adopted the ring-close forms exhibit no antibacterial activity against ATCC43300 or ATCC19606"
    • That's because they are colorless and can't emit any singlet oxygen!

ref: -0 tags: two photon scanning microscope mirror relay date: 01-31-2020 02:46 gmt revision:1

PMID-24877017 Optimal lens design and use in laser-scanning microscopy

  • Detail careful design of a scanning two-photon microscope, with custom scan lens, tube lens, and standard 25x objective.
  • Near diffraction limited performance for both the scan and tube lenses across a broad excitation range -- 690 to 1400nm.
  • Interestingly, use a parabolic mirror relay to conjugate the two galvos to each other; seems like a good idea, why has this not been done elsewhere?

ref: -0 tags: lavis jf dyes fluorine zwitterion lactone date: 01-22-2020 20:06 gmt revision:0

Optimization and functionalization of red-shifted rhodamine dyes

  • Zwitterion form is fluorescent and colored; lactone form is not and colorless.
  • Lactone form is lipophilic; some mix seems more bioavailable and also results in fluorogenic dyes.
  • Good many experiments with either putting fluorine on the azetidines or on the benzyl ring.
  • Fluorine on the azetidine pushes the K_{Z-L} toward the lactone form; fluorine on the benzyl ring pushes it toward the zwitterion.
  • Si-rhodamine and P-rhodamine adopt the lactone form, and adding appropriate fluorines can make them fluorescent again. Which makes for good red-shifted dyes, ala JF669
  • N-CH3 can be substituted in the oxygen position too, resulting in blue-shifted dye which is a good stand-in for EGFP.

ref: -0 tags: multifactor synaptic learning rules date: 01-22-2020 01:45 gmt revision:9

Why multifactor?

  • Take a simple MLP. Let x be the layer activation. X^0 is the input, X^1 is the second layer (first hidden layer). These are vectors, indexed like x^a_i .
  • Then X^1 = W X^0 , or x^1_j = \phi(\Sigma_{i=1}^N w_{ij} x^0_i) . \phi is the nonlinear activation function (ReLU, sigmoid, etc.)
  • In standard STDP the learning rule follows \Delta w \propto f(x_{pre}(t), x_{post}(t)) , or with layer number a , \Delta w^{a+1} \propto f(x^a(t), x^{a+1}(t))
    • (but of course nobody thinks there are 'numbers' on the 'layers' of the brain -- this is just referring to pre- and post-synaptic).
  • In an artificial neural network, \Delta w^a \propto -\frac{\partial E}{\partial w_{ij}^a} \propto -\delta_j^a x_i (intuitively: the weight change is proportional to the error propagated from higher layers times the input activity), where \delta_j^a = (\Sigma_{k=1}^{N} w_{jk} \delta_k^{a+1}) \phi' and \phi' is the derivative of the nonlinear activation function, evaluated at the given activation.
  • f(i, j) \rightarrow [x, y, \theta, \phi]
  • k = 13.165
  • x = round(i / k)
  • y = round(j / k)
  • \theta = a (\frac{i}{k} - x) + b (\frac{i}{k} - x)^2
  • \phi = a (\frac{j}{k} - y) + b (\frac{j}{k} - y)^2
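The backprop factors above, checked numerically in a two-layer sketch (shapes and nonlinearity are illustrative):

```python
import numpy as np

# The weight update is the outer product of the input activity x and the
# backpropagated error delta, where delta folds in downstream deltas and
# the activation derivative phi'.

rng = np.random.default_rng(4)
N = 5
phi = np.tanh
dphi = lambda u: 1 - np.tanh(u) ** 2

x0 = rng.normal(size=N)                  # input X^0
W1 = 0.5 * rng.normal(size=(N, N))
W2 = 0.5 * rng.normal(size=(N, N))

a1 = W1 @ x0; x1 = phi(a1)               # hidden layer X^1
a2 = W2 @ x1; x2 = phi(a2)               # output layer

target = rng.normal(size=N)
delta2 = (x2 - target) * dphi(a2)        # output-layer error
delta1 = (W2.T @ delta2) * dphi(a1)      # delta^a = (sum_k w_jk delta_k) phi'
dW1 = -np.outer(delta1, x0)              # Delta w^a proportional to -delta_j x_i
```

dW1 matches the finite-difference gradient of the squared error with respect to W1, which is the content of the proportionality above.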

ref: -2017 tags: human level concept learning through probabilistic program induction date: 01-20-2020 15:45 gmt revision:0

PMID-26659050 Human-level concept learning through probabilistic program induction

  • Preface:
    • How do people learn new concepts from just one or a few examples?
    • And how do people learn such abstract, rich, and flexible representations?
    • How can learning succeed from such sparse data yet also produce such rich representations?
    • For any theory of learning, fitting a more complicated model requires more data, not less, to achieve some measure of good generalization, usually in the difference between new and old examples.
  • Learning proceeds by constructing programs that best explain the observations under a Bayesian criterion, and the model 'learns to learn' by developing hierarchical priors that allow previous experience with related concepts to ease learning of new concepts.
  • These priors represent learned inductive bias that abstracts the key regularities and dimensions of variation holding across both types of concepts and across instances.
  • BPL can construct new programs by reusing pieces of existing ones, capturing the causal and compositional properties of real-world generative processes operating on multiple scales.
  • Posterior inference requires searching the large combinatorial space of programs that could have generated a raw image.
    • Our strategy uses fast bottom-up methods (31) to propose a range of candidate parses.
    • That is, they reduce the character to a set of lines (series of line segments), then simplify the intersections of those lines, and run a series of parses to estimate the generation of those lines, with heuristic criteria to encourage continuity (e.g. no sharp angles, penalty for abruptly changing direction, etc.).
    • The most promising candidates are refined by using continuous optimization and local search, forming a discrete approximation to the posterior distribution P(program, parameters | image).

ref: -2017 tags: locality sensitive hashing olfaction kenyon cells neuron sparse representation date: 01-18-2020 21:13 gmt revision:1

PMID-29123069 A neural algorithm for a fundamental computing problem

  • Central idea: locality-sensitive hashing, e.g. hashing that is sensitive to the high-dimensional locality of the input space, can be efficiently solved using a circuit inspired by the insect olfactory system.
  • Here, activation of 50 different types of ORNs is mapped to 50 projection neurons, which 'centers the mean' -- concentration dependence is removed.
  • This is then projected via a random matrix of sparse binary weights to a much larger set of Kenyon cells, which in turn are inhibited by one APL neuron.
  • Normal locality-sensitive hashing uses dense matrices of Gaussian-distributed random weights, which means higher computational complexity...
  • ... these projections are governed by the Johnson-Lindenstrauss lemma, which says that projection from high-d to low-d space can preserve locality (distance between points) within an error bound.
  • Show that the WTA selection of the top 5% plus random binary weight preserves locality as measured by overlap with exact input locality on toy data sets, including MNIST and SIFT.
  • Flashy title as much as anything else got this into Science... indeed, has only been cited 6 times in Pubmed.
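A toy version of the fly circuit (the 50 PN inputs, sparse binary weights, and top-5% WTA follow the description above; the Kenyon-cell count of 2000 and 10% connection density are illustrative):

```python
import numpy as np

# Fly-style LSH: mean-center the 'PN' inputs, expand through sparse
# binary random weights to many 'Kenyon cells', then keep only the top
# 5% of activations (the APL winner-take-all).

rng = np.random.default_rng(5)
d, kc = 50, 2000
M = (rng.random((kc, d)) < 0.1).astype(float)   # sparse binary projection

def fly_hash(x, top_frac=0.05):
    x = x - x.mean()                    # 'centers the mean'
    y = M @ x
    k = int(kc * top_frac)
    tag = np.zeros(kc, dtype=bool)
    tag[np.argsort(y)[-k:]] = True      # top 5% of Kenyon cells stay active
    return tag

a = rng.random(d)
b = a + 0.05 * rng.random(d)            # near neighbor of a
c = rng.random(d)                       # unrelated input
sim_ab = (fly_hash(a) & fly_hash(b)).sum()
sim_ac = (fly_hash(a) & fly_hash(c)).sum()
# Nearby inputs share far more active Kenyon cells than unrelated ones,
# which is the locality-sensitivity property.
```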

ref: -2014 tags: dopamine medium spiny neurons calcium STDP PKA date: 01-07-2020 03:43 gmt revision:2

PMID-25258080 A critical time window for dopamine actions on the structural plasticity of dendritic spines

  • Remarkably short time window for dopamine to modulate / modify (aggressive) STDP protocol.
  • Showed with the low-affinity calcium indicator Fluo4-FF that peak calcium concentrations in spines is not affected by optogenetic stimulation of dopamine fibers.
  • However, CaMKII activity is modulated by DA activity -- when glutamate uncaging and depolarization was followed by optogenetic stimulation of DA fibers followed, the FRET sensor Camui-CR reported significant increases of CaMKII activity.
  • This increase was abolished by the application of a DARPP-32 inhibiting peptide, which blocks the interaction of dopamine- and cAMP-regulated phosphoprotein 32 kDa (DARPP-32) with protein phosphatase 1 (PP-1).
    • Spine enlargement was induced in the absence of optogenetic dopamine when PP-1 was inhibited by calyculin A...
    • Hence, phosphorylation of DARPP-32 by PKA inhibits PP-1 and disinhibits CaMKII. (This causal inference seems loopy; they reference a hippocampal paper, [18])
  • To further test this, they used a FRET probe of PKA activity, AKAR2-CR. This sensor showed that PKA activity extends throughout the dendrite, not just the stimulated spine, and can respond to DA release directly.

ref: -0 tags: nonlinear hebbian synaptic learning rules projection pursuit date: 12-12-2019 00:21 gmt revision:4

PMID-27690349 Nonlinear Hebbian Learning as a Unifying Principle in Receptive Field Formation

  • Here we show that the principle of nonlinear Hebbian learning is sufficient for receptive field development under rather general conditions.
  • The nonlinearity is defined by the neuron’s f-I curve combined with the nonlinearity of the plasticity function. The outcome of such nonlinear learning is equivalent to projection pursuit [18, 19, 20], which focuses on features with non-trivial statistical structure, and therefore links receptive field development to optimality principles.
  • \Delta w \propto x h(g(w^T x)) , where h is the Hebbian plasticity term, g is the neuron's f-I curve (input-output relation), and x is the (sensory) input.
  • The relevant property of natural image statistics is that the distribution of features derived from typical localized oriented patterns has high kurtosis [5,6, 39]
  • Model is a generalized leaky integrate and fire neuron, with triplet STDP
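A minimal simulation of this rule on toy input with one high-kurtosis direction; the choices of g and h here are illustrative stand-ins, not the paper's exact f-I curve or plasticity function:

```python
import numpy as np

# Nonlinear Hebbian learning, Delta w proportional to x h(g(w^T x)),
# run on Gaussian input with one heavy-tailed (high-kurtosis) axis.
# Projection pursuit predicts w should align with that axis.

rng = np.random.default_rng(6)
d, T = 10, 30000
X = rng.normal(size=(T, d))
X[:, 0] = 2.0 * rng.laplace(size=T)      # heavy-tailed feature on axis 0

g = np.tanh                              # stand-in f-I curve
h = lambda y: y ** 3                     # stand-in Hebbian nonlinearity

w = 0.1 * rng.normal(size=d)
for x in X:
    w += 0.001 * x * h(g(w @ x))
    w /= np.linalg.norm(w)               # keep the weight vector bounded

# w should end up concentrated on the high-kurtosis direction (axis 0).
```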

ref: -2016 tags: spiking neural network self supervised learning date: 12-10-2019 03:41 gmt revision:2

PMID: Spiking neurons can discover predictive features by aggregate-label learning

  • This is a meandering, somewhat long-winded, and complicated paper, even for the journal Science. It's not been cited a great many times, but none-the-less is of interest.
  • The goal of the derived network is to detect fixed-pattern presynaptic sequences, and fire a prespecified number of spikes to each occurrence.
  • One key innovation is the use of a spike-threshold-surface for a 'tempotron' [12], the derivative of which is used to update the weights of synapses after trials. As the author says, spikes are hard to differentiate; the STS makes this more possible. This is hence standard gradient descent: if the neuron missed a spike then the weight is increased based on aggregate STS (for the whole trial -- hence the neuron / SGD has to perform temporal and spatial credit assignment).
    • As common, the SGD is appended with a momentum term.
  • Since STS differentiation is biologically implausible -- where would the memory lie? -- he also implements a correlational synaptic eligibility trace. The correlation is between the postsynaptic voltage and the EPSC, which seems kinda circular.
    • Unsurprisingly, it does not work as well as the SGD approximation. But does work...
  • Second innovation is the incorporation of self-supervised learning: a 'supervisory' neuron integrates the activity of a number (50) of feature detector neurons, and reinforces them to basically all fire at the same event, WTA style. This effects an unsupervised feature detection.
  • This system can be used with sort-of lateral inhibition to reinforce multiple features. Not so dramatic -- continuous feature maps.

Editorializing a bit: I said this was interesting, but why? The first part of the paper is another form of SGD, albeit in a spiking neural network, where the gradient is harder to compute and hence is done numerically.

It's the aggregate part that is new -- pulling in repeated patterns through synaptic learning rules. Of course, to do this, the full trace of pre and post synaptic activity must be recorded (??) for estimating the STS (i think). An eligibility trace moves in the right direction as a biologically plausible approximation, but as always nothing matches the precision of SGD. Can the eligibility trace be amended with e.g. neuromodulators to push the performance near that of SGD?

The next step of adding self supervised singular and multiple features is perhaps toward the way the brain organizes itself -- small local feedback loops. These features annotate repeated occurrences of stimuli, or tile a continuous feature space.

Still, the fact that I haven't seen any follow-up work is suggestive...

Editorializing further, there is a limited quantity of work that a single human can do. In this paper, it's a great deal of work, no doubt, and the author offers some good intuitions for the design decisions. Yet still, the total complexity that even a very determined individual can amass is limited, and likely far below the structural complexity of a mammalian brain.

This implies that inference either must be distributed and compositional (the normal path of science), or the process of evaluating & constraining models must be significantly accelerated. This latter option is appealing, as current progress in neuroscience seems highly technology limited -- old results become less meaningful when the next wave of measurement tools comes around, irrespective of how much work went into them. (Though: the impetus for measuring a particular thing in biology is only discovered through these 'less meaningful' studies...).

A third option, perhaps one which many theoretical neuroscientists believe in, is that there are some broader, physics-level organizing principles to the brain. Karl Friston's free energy principle is a good example of this. Perhaps at a meta level some organizing theory can be found, or more likely a set of theories; but IMHO, you'll need at least one theory per brain area, just the same as each area is morphologically, cytoarchitecturally, and topologically distinct. (There may be only a few theories of the cortex, despite all its areas, which is why so many are eager to investigate it!)

So what constitutes a theory? Well, you have to meaningfully describe what a brain region does. (Why is almost as important; how more important to the path there.) From a sensory standpoint: what information is stored? What processing gain is enacted? How does the stored information impress itself on behavior? From a motor standpoint: how are goals selected? How are the behavioral segments to attain them sequenced? Is the goal / behavior even a reasonable way of factoring the problem?

Our dual problem, building the bridge from the other direction, is perhaps easier. Or it could be that a lot more money has gone into it. Either way, much progress has been made in AI. One arm is deep function approximation / database compression for fast and organized indexing, aka deep learning. Many people are thinking about that; no need to add to the pile; anyway, as OpenAI has shown, the common solution to many problems is to simply throw more compute at them. A second arm is deep reinforcement learning, which is hideously sample- and path-inefficient, hence ripe for improvement. One side is motor: rather than indexing raw motor variables (LRUD in a video game, or joint torques with a robot...), you can index motor primitives, perhaps hierarchically built; likewise, for the sensory input, the model needs to infer structure about the world. This inference should decompose overwhelming sensory experience into navigable causes ...

But how can we do this decomposition? The cortex is more than adept at it, but now we're at the original problem, one that the paper above purports to make a stab at.

hide / / print
ref: -0 tags: dLight1 dopamine imaging Tian date: 12-05-2019 17:27 gmt revision:0 [head]

PMID-29853555 Ultrafast neuronal imaging of dopamine dynamics with designed genetically encoded sensors

  • cpGFP-based sensor. ΔF/F ~ 3.

hide / / print
ref: -0 tags: surface plasmon resonance voltage sensing antennas PEDOT imaging spectroscopy date: 12-05-2019 16:47 gmt revision:1 [0] [head]

Electro-plasmonic nanoantenna: A nonfluorescent optical probe for ultrasensitive label-free detection of electrophysiological signals

  • Use spectroscopy to measure extracellular voltage, via plasmon concentrated electrochromic effects in doped PEDOT.

hide / / print
ref: -0 tags: multimode fiber imaging date: 11-15-2019 03:10 gmt revision:2 [1] [0] [head]

PMID-30588295 Subcellular spatial resolution achieved for deep-brain imaging in vivo using a minimally invasive multimode fiber

  • Oh wow wowww
  • Imaged through a 50um multimode optical fiber!
  • Multimode scattering matrix was inverted through a LC-SLM

hide / / print
ref: -0 tags: adaptive optics sensorless retina fluorescence imaging optimization zernicke polynomials date: 11-15-2019 02:51 gmt revision:0 [head]

PMID-26819812 Wavefront sensorless adaptive optics fluorescence biomicroscope for in vivo retinal imaging in mice

  • Idea: use backscattered and fluorescence light to optimize the confocal image through imperfect optics ... and the lens of the mouse eye.
    • Optimization was based on hill-climbing / line search of each Zernicke polynomial term for the deformable mirror. (The mirror had to be characterized beforehand, naturally).
    • No guidestar was needed!
  • Were able to resolve the dendritic processes of EGFP labeled Thy1 ganglion cells and Cx3 glia.

hide / / print
ref: -2019 tags: non degenerate two photon excitation fluorophores fluorescence OPO optical parametric oscillator date: 10-31-2019 20:53 gmt revision:0 [head]

Efficient non-degenerate two-photon excitation for fluorescence microscopy

  • Used an OPO + delay line to show that non-degenerate (e.g. photons of two different energies) can induce greater fluorescence, normalized to input energy, than normal same-energy excitation.

hide / / print
ref: -2015 tags: PaRAC1 photoactivatable Rac1 synapse memory optogenetics 2p imaging mouse motor skill learning date: 10-30-2019 20:35 gmt revision:1 [0] [head]

PMID-26352471 Labelling and optical erasure of synaptic memory traces in the motor cortex

  • Idea: use Rac1, which has been shown to induce spine shrinkage, coupled to a light-activated domain to allow for optogenetic manipulation of active synapses.
  • PaRac1 was coupled to a deletion mutant of PSD95, PSD delta 1.2, which concentrates at the postsynaptic site, but cannot bind to postsynaptic proteins, thus minimizing the undesirable effects of PSD-95 overexpression.
    • PSD-95 is rapidly degraded by proteosomes
    • This gives spatial selectivity.
  • They then exploited the dendritic targeting element (DTE) of Arc mRNA, which is selectively targeted to and translated in activated dendritic segments in response to synaptic activation, in an NMDA-receptor-dependent manner.
    • Thereby giving temporal selectivity.
  • Construct is then PSD-PaRac1-DTE; this was tested on hippocampal slice cultures.
  • Improved sparsity and labelling further by driving it with the Arc promoter.
  • Motor learning is impaired in Arc KO mice; hence inferred that the induction of AS-PaRac1 by the Arc promoter would enhance labeling during learning-induced potentiation.
  • Delivered construct via in-utero electroporation.
  • Observed rotarod-induced learning; the PaRac1 signal decayed after two days, but the spine volume persisted in spines that showed Arc/DTE-driven (hence photoactivation-labeled) activity.
  • Now, since they had a good label, performed rotarod training followed by (at variable delay) light pulses to activate Rac, thereby suppressing recently-active synapses.
    • Observed a depression of behavioral performance.
    • Controlled with a second task; could selectively impair performance on one of the tasks based on ordering/timing of light activation.
  • The localized probe also allowed them to image the synapse populations active for each task, which were largely non-overlapping.

hide / / print
ref: -0 tags: carbon capture links date: 10-18-2019 14:20 gmt revision:0 [head]

Carbon capture links:

hide / / print
ref: -0 tags: Lucy Flavin mononucelotide FAD FMN fluorescent protein reporter date: 10-17-2019 19:54 gmt revision:1 [0] [head]

PMID-25906065 LucY: A Versatile New Fluorescent Reporter Protein

hide / / print
ref: -2019 tags: meta learning feature reuse deepmind date: 10-06-2019 04:14 gmt revision:1 [0] [head]

Rapid learning or feature reuse? Towards understanding the effectiveness of MAML

  • It's feature re-use!
  • Show this by freezing the weights of a 5-layer convolutional network when training on Mini-imagenet, either 5-way 1-shot or 5-way 5-shot.
  • From this derive ANIL, where only the last network layer is updated in task-specific training.
  • Show that ANIL works for basic RL learning tasks.
  • This means that, roughly, the network does not benefit much from joint encoding -- encoding both the task at hand and the feature set. Features can be learned independently from the task (at least these tasks), with little loss.
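The core of ANIL can be sketched in a few lines: keep the feature extractor frozen and run the task-specific inner-loop adaptation only on the final linear head. A minimal sketch (the random-projection 'body', logistic head, and toy task are all invented for illustration; the real method wraps this inner loop in MAML's outer loop over tasks):

```python
import numpy as np

rng = np.random.default_rng(0)

# frozen 'feature extractor': fixed random projection + ReLU (stand-in for the conv body)
W_feat = rng.standard_normal((8, 2))

def features(x):                       # x: (n, 2) -> (n, 8)
    return np.maximum(x @ W_feat.T, 0.0)

# the ANIL inner loop: gradient steps on the last-layer (head) weights ONLY
def adapt_head(F, y, steps=200, lr=0.5):
    w = np.zeros(F.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(F @ w)))       # logistic head
        w -= lr * F.T @ (p - y) / len(y)         # update head; body stays frozen
    return w

# toy binary task: classify by sign of x0 + x1
x = rng.standard_normal((64, 2))
y = (x.sum(axis=1) > 0).astype(float)

F = features(x)
w = adapt_head(F, y)
p = 1.0 / (1.0 + np.exp(-(F @ w)))
acc = float(((p > 0.5) == (y > 0.5)).mean())
print(acc)
```

Even with a random frozen body, a head-only inner loop fits this toy task well, which is the paper's point: most of MAML's work is in the reusable features, not in task-specific adaptation of them.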

hide / / print
ref: -0 tags: ETPA entangled two photon absorption Goodson date: 09-24-2019 02:25 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

Can we image biological tissue with entangled photons?

How much fluorescence can we expect, based on reasonable concentrations & published ETPA cross sections?

Start with Beer's law: A = σLN, where A = absorbance; L = sample length, 10 μm (1e-3 cm); N = concentration, 10 μmol/l; σ = cross-section, for ETPA assume 2.4e-18 cm²/molecule (this is based on an FMN-based fluorophore; the actual cross-section may be higher). Including Avogadro's number and 1 l = 1000 cm³, A = 1.45e-5.

Now, add in quantum efficiency φ = 0.8 (Rhodamine); collection efficiency η = 0.2; and an incoming photon pair flux of I = 1e12 photons/sec/mode (which is roughly the limit for quantum behavior; n = 0.1 photons/mode; will add this calculation).

F = φησLNI = 2.3e6 photons/sec. This is very low, but within practical imaging limits. As a comparison, incoherent 2p imaging creates ~100 photons per pulse, of which 10 make it to the detector; for 512 x 512 pixels at 15 fps, the dwell time on each pixel is 20 pulses of an 80 MHz Ti:sapphire laser, or ~200 photons.

Note the pair flux is per optical mode; for a typical application, we'll use a Nikon 16x objective with a 600 μm Ø FOV and 0.8 NA. At 800 nm imaging wavelength, the diffraction limit is 0.5 μm. This equates to about 7e5 addressable modes in the FOV. Then an illumination of 1e12 photons/sec/mode equates to 7e17 photons over the whole field; if each photon pair has an energy of 2.75 eV (λ = 450 nm), this is equivalent to 300 mW. 100 mW is a reasonable limit, hence scale the incoming flux to 2.3e17 pairs/sec.
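The back-of-envelope numbers above are easy to check directly (all constants as assumed in the text):

```python
N_A = 6.022e23        # Avogadro's number
sigma = 2.4e-18       # cm^2/molecule, assumed ETPA cross-section
L = 1e-3              # cm, 10 um sample length
conc = 10e-6          # mol/l
N = conc * N_A / 1000.0        # molecules / cm^3
A = sigma * L * N              # Beer's-law absorbance, ~1.45e-5

phi, eta = 0.8, 0.2            # quantum and collection efficiencies
I = 1e12                       # photon pairs / sec / mode
F = phi * eta * A * I          # detected photons / sec, ~2.3e6

modes = 7e5                    # addressable modes in the 600 um FOV (from the text)
eV = 1.602e-19                 # J
P = modes * I * 2.75 * eV      # total illumination power, ~0.3 W
print(A, F, P)
```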

Hence, the imaging mode is power limited, not quantum limited (if you could get such a bright entangled source). And right now that's the limit -- for a BBO crystal, circa 1998 experimenters were getting 1e4 photons/sec/mW. So, 2.3e17 pairs/sec would require 23 GW. Yikes.

More efficient entangled sources have been developed using periodically-poled potassium titanyl phosphate (PPPTP), which (again assuming linearity) puts the power requirement at 23 MW. This is within reach of Q-switched lasers, but still incredibly inefficient. The down-conversion process is not linear in intensity, which is why Goodson pumps with SHG from a Ti:sapphire to yield ~1e7 photons; but this of course induces temporal correlations which increase the frequency of incoherent TPA.

Still, combining PPPTP with a Ti:sapphire laser could result in 1e13 photons/sec, which is sufficient for scanned microscopy. Since the laser is pulsed, it will still be subject to incoherent TPA; but that's OK -- the point is to reduce the power going into the animal via the larger ETPA cross-section. So the answer to the question above is a tentative yes. Upon the development of brighter entangled sources (e.g. arrays of quantum structures), this could move to fully widefield imaging.

hide / / print
ref: -0 tags: co2 capture entropy carbon dioxide date: 09-22-2019 00:46 gmt revision:1 [0] [head]

How much energy is thermodynamically required to concentrate CO2 from one liter of air?

CO2 concentration is 400 ppm, or 0.04%. 1 l of air is 1/22.4, or 44 mmol. From wikipedia, the entropy of mixing is:

Δ_mix S = -nR(x₁ ln(x₁) + x₂ ln(x₂)), where x₁ and x₂ are the mole fractions of air and CO2 (0.9996 and 0.0004).

This works out to 1.3e-3 J/K per liter. At 300 K, this means you need only about 0.4 J to extract the carbon dioxide from that liter.

A car driving 1 km emits about 150 g of carbon dioxide. This is 3.4 moles, which will diffuse into 8.5e3 moles of air, or 190e3 liters of air (190 cubic meters). To pull this back out of the air you'd need at minimum about 75 kJ.

This is not much at all -- a car produces ~100 kW of mechanical power, or 100 kJ every second, and presumably it takes a minute to drive that 1 km. But such perfectly efficient separation is not possible.
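A quick check of the arithmetic, taking the CO2 mole fraction as 400 ppm (4e-4):

```python
import math

R = 8.314             # J / (mol K)
T = 300.0             # K
x_co2 = 400e-6        # 400 ppm mole fraction
x_air = 1.0 - x_co2
n = 1.0 / 22.4        # mol of gas in one liter at STP

# entropy of mixing for one liter of air (positive, since both x's are < 1)
dS = -n * R * (x_air * math.log(x_air) + x_co2 * math.log(x_co2))
E_liter = T * dS      # minimum work to unmix one liter, ~0.4 J

n_co2 = 150.0 / 44.0             # mol CO2 emitted per km of driving
liters = n_co2 / (n * x_co2)     # liters of air holding that much CO2 at 400 ppm
E_total = liters * E_liter       # ~75 kJ
print(E_liter, E_total)
```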

hide / / print
ref: -0 tags: ETPA entangled two photon absorption Goodson date: 09-19-2019 15:49 gmt revision:13 [12] [11] [10] [9] [8] [7] [head]

Various papers put out by the Goodson group:

And from a separate group at Northwestern:

  • Entangled Photon Resonance Energy Transfer in Arbitrary Media
    • Suggests three orders of magnitude improvement in cross-section relative to incoherent TPA.
    • In SPDC, photon pairs are generated randomly and usually accompanied by undesirable multipair emissions.
      • For solid-state artificial atomic systems with radiative cascades (single quantum emitters like quantum dots), the quantum efficiency is near unity.
    • Paper is highly mathematical, and deals with resonance energy transfer (which is still interesting)

Regarding high fluence sources, quantum dots / quantum structures seem promising.

hide / / print
ref: -0 tags: betzig lattice light sheet date: 09-18-2019 18:32 gmt revision:0 [head]

PMID-25342811 Lattice Light Sheet Microscopy: Imaging Molecules to Embryos at High Spatiotemporal Resolution

hide / / print
ref: -2012 tags: cortex striatum learning carmena costa basal ganglia date: 09-13-2019 18:30 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

PMID-22388818 Corticostriatal plasticity is necessary for learning intentional neuroprosthetic skills.

  • Trained a mouse to control an auditory cursor, as in Kipke's task {99}. Did not cite that paper, claimed it was 'novel'. oops.
  • Summed neuronal firing rate of groups of 2 or 4 M1 neurons.
  • Auditory feedback was essential for the operant learning.
    • One group increased the frequency with increased firing rate; the other decreased tone with increasing FR.
  • Specific deletion of striatal NMDA receptors impairs the ability to learn neuroprosthetic skills.
    • Hence, they argue, cortico-striatal plasticity is required to learn abstract skills, such as this tone-to-firing-rate target acquisition task.
  • Controlled by recording EMG of the vibrissae + injection of lidocane into the whisker pad.
  • One reward was sucrose solution; the other was a food pellet. When the rat was satiated on one modality, they showed increased preference for the opposite reward during BMI control -- thereby demonstrating intentionality. Clever!
  • Noticed pronounced oscillatory spike coupling, the coherence of which was increased in low-frequency bands in late learning relative to early learning (figure 3).
  • Genetic manipulations: knockin line that expresses Cre recombinase in both striatonigral and striatopallidal medium spiny neurons, crossed with mice carrying a floxed allele of the NMDAR1 gene.
    • These animals are relatively normal, and can learn to perform rapid sequential movements, but are unable to learn precise motor sequences.
    • Acute pharmacological blockade of NMDAR did not affect performance of the neuroprosthetic skill.
    • Hence the deficits in the transgenic mice are due to an inability to learn, not to perform, the skill.

hide / / print
ref: Gage-2005.06 tags: naive coadaptive control Kalman filter Kipke audio BMI date: 09-13-2019 02:33 gmt revision:2 [1] [0] [head]

PMID-15928412[0] Naive coadaptive cortical control. May 2005. see notes


hide / / print
ref: Jackson-2007.01 tags: Fetz neurochip sleep motor control BMI free behavior EMG date: 09-13-2019 02:21 gmt revision:4 [3] [2] [1] [0] [head]

PMID-17021028[0] Correlations Between the Same Motor Cortex Cells and Arm Muscles During a Trained Task, Free Behavior, and Natural Sleep in the Macaque Monkey

  • used their implanted "neurochip" recorder that recorded both EMG and neural activity. The neurochip buffers data and transmits via IR offline. It doesn't have all that much flash onboard - 16Mb.
    • used teflon-insulated 50um tungsten wires.
  • confirmed that there is a strong causal relationship, constant over the course of weeks, between motor cortex units and EMG activity.
    • some causal relationships between neural firing and EMG varied dependent on the task. Additive / multiplicative encoding?
  • this relationship was different at night, during REM sleep, though (?)
  • point out, as Todorov did, that stereotyped motion imposes correlations between movement parameters, which could lead to spurious relationships being mistaken for neural coding.
    • Experiments with naturalistic movement are essential for understanding innate, untrained neural control.
  • references {597} Suner et al 2005 as a previous study of long term cortical recordings. (utah probe)
  • during sleep, M1 cells exhibited a cyclical pattern of quiescence followed by periods of elevated activity;
    • the cycle lasted 40-60 minutes;
    • EMG activity was seen at entrance and exit to the elevated activity period.
    • during periods of highest cortical activity, muscle activity was completely suppressed.
    • peak firing rates were above 100hz! (mean: 12-16hz).


hide / / print
ref: -2019 tags: Kleinfeld Harris record every neuron date: 09-13-2019 01:51 gmt revision:0 [head]

PMID-31495645 Can One Concurrently Record Electrical Spikes from Every Neuron in a Mammalian Brain?

  • Argues for a concrete arrangement of 6um diamond (1.2TPa modulus) shanks, 2mm long, on 40um hexagonal grid. Each would be patterned with 5 layers of metal, 30nm x 30nm Au traces (what about surface roughness?), high dielectric insulation, 9um x 14um TiN contacts.
  • This will be mated to state of the art adaptive amplifiers, which would be biased to only burn necessary power needed to sort spikes.
  • The sharpened spikes should penetrate the brain; 4um diameter diamond shanks should also work...
  • Overall volume displacement ~ 2% (which still seems high).
  • Suggest that the shanks can push capillaries out of the way, or puncture them while making a seal. Clearly, that's possible ...
  • ... but realistically, unless these are inserted glacially slowly, it will cause possibly catastrophic / cascading inflammation. (Which can spread on the order of 100-150um).
  • Does not cite Marblestone 2013.
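The ~2% displacement figure checks out for 6 um shanks on a 40 um hexagonal grid:

```python
import math

shank_d = 6.0     # um, diamond shank diameter
pitch = 40.0      # um, hexagonal grid spacing

shank_area = math.pi * (shank_d / 2.0)**2
hex_cell = (math.sqrt(3) / 2.0) * pitch**2   # lattice area per shank in a hex grid
fill = shank_area / hex_cell                 # areal = volumetric fraction for straight shanks
print(fill)       # ~0.02
```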

hide / / print
ref: -0 tags: swept field confocal date: 09-12-2019 20:01 gmt revision:1 [0] [head]

PMID-22831554 Swept field laser confocal microscopy for enhanced spatial and temporal resolution in live-cell imaging.

  • Confocal imaging was invented by Marvin Minsky back in 1955 -- see his memoir!
  • Idea is not unlike light-sheet imaging -- sweep a confocal slit and laser line across a sample, rather than a pinhole and point, respectively.
  • This results in lower phototoxicity, but still reasonable rejection of out-of-focus light compared to widefield imaging.

hide / / print
ref: -2017 tags: two photon holographic imaging Arch optogenetics GCaMP6 date: 09-12-2019 19:24 gmt revision:1 [0] [head]

PMID-28053310 Simultaneous high-speed imaging and optogenetic inhibition in the intact mouse brain.

  • Bovetti S1, Moretti C1, Zucca S1, Dal Maschio M1, Bonifazi P2,3, Fellin T1.
  • Image GCaMP6 in either scanned mode (high resolution, slow) or holographic mode (SLM + RedShirt 80x80 NeuroCCD); simultaneously activate the opsin Arch and record juxtasomal action potentials.

hide / / print
ref: -2007 tags: photobleaching GFP date: 09-10-2019 01:42 gmt revision:1 [0] [head]

PMID-17179937 Major signal increase in fluorescence microscopy through dark-state relaxation (2007)

  • 5-25x increase in fluorescence yields.
  • Idea: allow the (dark) triplet states to decay naturally by keeping inter-pulse intervals of illumination greater than 1us.
  • Works for both 1p and 2p.
  • For volume imaging via 2p, I don’t think that the ~1 μs dark-state decay time is much of an issue; revisit given fluorophores after >1 ms!
  • Suggests again that the transition from the triplet dark state to an excited higher state is a prominent or significant cause of photobleaching; also suggests that triplet quenching will have limited utility in scanned or pulsed 2p systems (it will have more utility in 1p systems, perhaps..)
  • Atto532 dye has low intersystem crossing to the triplet state (1%) [3,5,14] .. humm.
  • 2p total photon emission seems to flatten above 100GW/cm^2 intensity.
  • 2p absorption is easily saturated independent of pulse width: for short pulses, high intensity leads to absorption to T1 state, which has high cross-section to the Tn>1 state; longer pulses give more time for single-photon absorption.
  • Increasing τ_p by m = 200 and hence the pulse energy by 14-fold does not have a considerable effect on G2p. This obviously indicates that the saturation of the S0 → S1 or of the T1 → Tn>1 excitation eliminates any dependence on pulse peak intensity or energy.

hide / / print
ref: -0 tags: computational neuroscience opinion tony zador konrad kording lillicrap date: 07-30-2019 21:04 gmt revision:0 [head]

Two papers out recently in Arxive and Biorxiv:

  • A critique of pure learning: what artificial neural networks can learn from animal brains
    • Animals learn rapidly and robustly, without the need for labeled sensory data, largely through innate mechanisms as arrived at and encoded genetically through evolution.
    • Still, this cannot account for the connectivity of the human brain, which is much too large for the genome to specify; in us, there are canonical circuits and patterns of intra-area connectivity which act as the 'innate' learning biases.
    • Mice and men are not so far apart evolutionarily. (I've heard this also from people FIB-SEM imaging cortex.) Hence, understanding one should appreciably lead us to understand the other. (I agree with this sentiment, but for the fact that lab mice are dumb, and have pretty stereotyped behaviors.)
    • References Long short term memory and learning to learn in networks of spiking neurons -- which claims that a hybrid algorithm (BPTT with neuronal rewiring) with realistic neuronal dynamics markedly increases the computational power of spiking neural networks.
  • What does it mean to understand a neural network?
    • As has been the intuition with a lot of neuroscientists probably for a long time, posits that we have to investigate the developmental rules (wiring and connectivity, same as above) plus the local-ish learning rules (synaptic, dendritic, other .. astrocytic).
      • The weights themselves, in either biological neural networks, or in ANN's, are not at all informative! (Duh).
    • Emphasizes the concept of compressibility: how much information can be discarded without impacting performance? With some modern ANNs, 30-50x compression is possible. Authors here argue that little compression is possible in the human brain -- the wealth of all those details about the world is needed! In other words, no compact description is possible.
    • Hence, you need to learn how the network learns those details, and how it's structured so that important things are learned rapidly and robustly, as seen in animals (very similar to above).

hide / / print
ref: -0 tags: python timelapse script date: 07-30-2019 20:45 gmt revision:3 [2] [1] [0] [head]

Edited Terrence Eden's script to average multiple frames when producing a time-lapse video from a continuous video. Frames are averaged together before decimation, rather than pure decimation, as with ffmpeg. Produces appealing results on subjects like water. Also, outputs a video directly, without having to write individual images.

import cv2
import sys

#   Video to read
print(str(sys.argv[1]))
vidcap = cv2.VideoCapture(sys.argv[1])

#   Which frame to start from, how many frames to go through
start_frame = 0
frames = 61000

#   Counters
count = 0
save_seq = 0
decimate = 10
rolling = 16 # average over N output frames
transpose = False

if transpose:
	h = vidcap.get(3)
	w = vidcap.get(4)
else:
	w = vidcap.get(3)
	h = vidcap.get(4)

fourcc = cv2.VideoWriter_fourcc(*'mp4v')
writer = cv2.VideoWriter("timelapse.mp4", fourcc, 30, (int(w), int(h)), True)

avglist = []

while True:
	#   Read a frame
	success, image = vidcap.read()
	if not success:
		break
	if count > start_frame + frames:
		break
	if count >= start_frame:
		if count % decimate == 0:
			#   Start a new accumulator; uint16 so the sum doesn't overflow
			avg = image.astype('uint16') # max 255 frames averaged.
		if count % decimate > 0 and count % decimate <= (decimate-1):
			avg = avg + image.astype('uint16')
		if count % decimate == (decimate-1):
			#   Average the frames just accumulated
			avg = avg / decimate
			if transpose:
				avg = cv2.transpose(avg)
				avg = cv2.flip(avg, 1)
			#   Rolling average over the last `rolling` output frames
			avg2 = avg
			for a in avglist:
				avg2 = avg2 + a
			avg2 = avg2 / (len(avglist) + 1)
			avglist.append(avg)
			if len(avglist) >= rolling:
				avglist.pop(0) # remove the first item.
			avg2 = avg2.astype('uint8')
			print("saving " + str(save_seq))
			#   Write the averaged frame to the output video
			# cv2.imwrite(filename+str('{0:03d}'.format(save_seq))+".png", avg)
			writer.write(avg2)
			save_seq += 1
	count += 1

writer.release()
vidcap.release()

hide / / print
ref: -2019 tags: neuromorphic optical computing date: 06-19-2019 14:47 gmt revision:1 [0] [head]

Large-Scale Optical Neural Networks based on Photoelectric Multiplication

  • Critical idea: use coherent homodyne detection, and quantum photoelectric multiplication for the MACs.
    • That is, E-fields from coherent light multiply rather than add within a (logarithmic) photodiode detector.
    • Other lit suggests rather limited SNR for this effect -- 11db.
  • Hence need EO modulators and OE detectors followed by nonlinearity etc.
  • Pure theory, suggests that you can compute with as few as 10's of photons per MAC -- or less! Near Landauer's limit.

hide / / print
ref: -2016 tags: fluorescent proteins photobleaching quantum yield piston GFP date: 06-19-2019 14:33 gmt revision:0 [head]

PMID-27240257 Quantitative assessment of fluorescent proteins.

  • Cranfill PJ1,2, Sell BR1, Baird MA1, Allen JR1, Lavagnino Z2,3, de Gruiter HM4, Kremers GJ4, Davidson MW1, Ustione A2,3, Piston DW
  • Model bleaching as log(F) = -α log(P) + c, or k_bleach = b·I^α, where F is the fluorescence intensity, P is the illumination power, and b and c are constants.
    • Most fluorescent proteins have α > 1, which means superlinear photobleaching -- more power bleaches faster.
  • Catalog the degree to which each protein tends to form aggregates by tagging to the ER and measuring ER morphology. Fairly thorough -- 10k cells each FP.
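The exponent α in k_bleach = b·I^α is just the slope of a log-log fit; a minimal sketch with synthetic, noise-free data (the powers and α = 1.5 are invented for illustration):

```python
import math

alpha_true, b = 1.5, 2.0
powers = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]    # illumination powers, arbitrary units
rates = [b * P**alpha_true for P in powers]   # k_bleach = b * I^alpha

# least-squares slope in log-log space: log k = alpha * log I + log b
xs = [math.log(P) for P in powers]
ys = [math.log(k) for k in rates]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
alpha_fit = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx)**2 for x in xs)
print(alpha_fit)   # recovers 1.5
```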

hide / / print
ref: -2017 tags: neuromorphic optical computing nanophotonics date: 06-17-2019 14:46 gmt revision:5 [4] [3] [2] [1] [0] [head]

Progress in neuromorphic photonics

  • Similar idea as what I had -- use lasers as the optical nonlinearity.
    • They add to this the idea of WDM and 'MRR' (micro-ring resonator) weight banks -- they don't talk about the ability to change the weights, just specify them with some precision.
  • Definitely makes the case that III-V semiconductor integrated photonic systems have the capability, in MMACs/mm^2/pj, to exceed silicon.

See also :

hide / / print
ref: -2013 tags: microscopy space bandwidth product imaging resolution UCSF date: 06-17-2019 14:45 gmt revision:0 [head]

How much information does your microscope transmit?

  • Typical objectives 1x - 5x, about 200 Mpix!

hide / / print
ref: -0 tags: nanophotonics interferometry neural network mach zehnder interferometer optics date: 06-13-2019 21:55 gmt revision:3 [2] [1] [0] [head]

Deep Learning with Coherent Nanophotonic Circuits

  • Used a series of Mach-Zehnder interferometers with thermoelectric phase-shift elements to realize the unitary component of individual layer weight-matrix computation.
    • Weight matrix was decomposed via SVD into U Σ V*; the unitary factors (4x4, special unitary group SU(4)) were realized by the MZI mesh, and the diagonal Σ via amplitude modulators. See figure above / original paper.
    • Note that interferometric matrix multiplication can (theoretically) be zero energy with an optical system (modulo loss).
      • In practice, you need to run the phase-moduator heaters.
  • Nonlinearity was implemented electronically after the photodetector (e.g. they had only one photonic circuit; to get multiple layers, fed activations repeatedly through it. This was a demonstration!)
  • Fed network FFT'd / banded recordings of consonants through the network to get near-simulated vowel recognition.
    • Claim that noise was from imperfect phase setting in the MZI + lower resolution photodiode read-out.
  • They note that the network can more easily (??) be trained via the finite difference algorithm (e.g. test out an incremental change per weight / parameter) since running the network forward is so (relatively) low-energy and fast.
    • Well, that's not totally true -- you need to update multiple weights at once in a large / deep network to descend any high-dimensional valleys.
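The decomposition itself is easy to verify numerically: applying the three optical stages (unitary mesh, amplitude modulators, unitary mesh) in sequence reproduces the full weight matrix. A sketch with a hypothetical 4x4 weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))   # hypothetical layer weight matrix

# W = U @ diag(s) @ Vh: U and Vh are unitary (the MZI meshes),
# diag(s) is the diagonal Sigma (the amplitude modulators)
U, s, Vh = np.linalg.svd(W)

x = rng.standard_normal(4)                  # input activation vector
y_direct = W @ x
y_staged = U @ (np.diag(s) @ (Vh @ x))      # the three optical stages in sequence
print(np.allclose(y_direct, y_staged))      # True
```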

hide / / print
ref: -2012 tags: phase change materials neuromorphic computing synapses STDP date: 06-13-2019 21:19 gmt revision:3 [2] [1] [0] [head]

Nanoelectronic Programmable Synapses Based on Phase Change Materials for Brain-Inspired Computing

  • Here, we report a new nanoscale electronic synapse based on technologically mature phase change materials employed in optical data storage and nonvolatile memory applications.
  • We utilize continuous resistance transitions in phase change materials to mimic the analog nature of biological synapses, enabling the implementation of a synaptic learning rule.
  • We demonstrate different forms of spike-timing-dependent plasticity using the same nanoscale synapse with picojoule level energy consumption.
  • Again uses GST germanium-antimony-tellurium alloy.
  • 50pJ to reset (depress) the synapse, 0.675pJ to potentiate.
    • Reducing the size will linearly decrease this current.
  • Synapse resistance changes from ~200 kΩ to ~2 MΩ.

See also: Experimental Demonstration and Tolerancing of a Large-Scale Neural Network (165 000 Synapses) Using Phase-Change Memory as the Synaptic Weight Element

hide / / print
ref: -0 tags: optical gain media lasers cross section dye date: 06-13-2019 15:13 gmt revision:2 [1] [0] [head]

Eminently useful. Source: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-974-fundamentals-of-photonics-quantum-electronics-spring-2006/lecture-notes/chapter7.pdf

Laser Dye technology by Peter Hammond

  • This paper is another great resource!
  • Lists the stimulated emission cross-section for Rhodamine 6G as 4e-16 cm² @ 550 nm, consistent with the table above.
  • At a (high) concentration of 2 mmol/l (1 g/l), the 1/e penetration depth is 20 μm.
    • Depending on the solvent, there may be aggregation and stacking / quenching.
  • Tumbling time of Rhodamine 6G in ethanol is 20 to 300ps; fluorescence lifetime in oscillators is 10's of ps, so there is definitely polarization sensitive amplification.
  • Generally in dye lasers, the emission cross-section must be higher than the excited-state absorption; σ_e - σ* is most important.
  • Bacteria can actually subsist on rhodamine-similar sulfonated dyes in aqueous solutions! Wow.
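The 20 μm penetration-depth figure follows directly from the cross-section and concentration via Beer's law:

```python
N_A = 6.022e23
sigma = 4e-16            # cm^2, stimulated emission cross-section, Rhodamine 6G @ 550 nm
conc = 2e-3              # mol/l (~1 g/l)
N = conc * N_A / 1000.0  # molecules / cm^3
depth_um = 1e4 / (sigma * N)   # 1/e depth, converted from cm to um
print(depth_um)          # ~21 um
```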

hide / / print
ref: -0 tags: lenslet optical processor date: 06-10-2019 04:26 gmt revision:0 [head]

Small gathering of links on Lenslet Labs / Lenslet inc. Founded in 1999.

  • Lenslet funding $26M 3rd round. November 2000.
  • vector-matrix optical multiplier Sept 2002
  • Patent on optical processor core. Idea includes 288 modulated VCSELs, 256x256 multi-quantum-well modulators, photodiodes, and lenses for splitting the light field or performing Fourier transforms.
  • Press release on the MQW SLM. 8 terra-ops. Jan 2004.
  • EnLight 64 press release, 240 billion ops/sec. Talks about having an actual software development platform, starting in Matlab. Photo of the device. Jan 1 2005.
  • Lenslet closes; lays off its last 30 employees. CEO is trying to liquidate IP, assets. March 2006.

hide / / print
ref: -2019 tags: optical neural networks spiking phase change material learning date: 06-01-2019 19:00 gmt revision:4 [3] [2] [1] [0] [head]

All-optical spiking neurosynaptic networks with self-learning capabilities

  • J. Feldmann, N. Youngblood, C. D. Wright, H. Bhaskaran & W. H. P. Pernice
  • Idea: use phase-change material to either block or pass the light in waveguides.
    • In this case, they used GST -- germanium-antimony-tellurium. This material is less reflective in the amorphous phase, which can be reached by heating to ~150C and rapidly quenching. It is more reflective in the crystalline phase, which occurs on annealing.
  • This is used for both plastic synapses (phase change driven by the intensity of the light) and the nonlinear output of optical neurons (via a ring resonator).
  • Uses optical resonators with very high Q factors to couple different wavelengths of light into the 'dendrite'.
  • Ring resonator on the output: to match the polarity of the phase-change material. Is this for reset? Storing light until trigger?
  • Were able to get correlative-like or hebbian learning (which I suppose is not dissimilar from really slow photographic film, just re-branded, and most importantly with nonlinear feedback.)
  • Issue: every weight needs a different source wavelength! Hence they have not demonstrated a multi-layer network.
  • Previous paper: All-optical nonlinear activation function for photonic neural networks
    • Only 3 dB and 7 dB extinction ratios for induced transparency and inverse saturation

hide / / print
ref: -0 tags: 3D SHOT Alan Hillel Waller 2p photon holography date: 05-31-2019 22:19 gmt revision:4 [3] [2] [1] [0] [head]

PMID-29089483 Three-dimensional scanless holographic optogenetics with temporal focusing (3D-SHOT).

  • Pégard NC1,2, Mardinly AR1, Oldenburg IA1, Sridharan S1, Waller L2, Adesnik H3,4
  • Combines computer-generated holography and temporal focusing for single-shot (no scanning) two-photon photo-activation of opsins.
  • The beam intensity profile determines the dimensions of the custom temporal focusing pattern (CTFP), while phase, a previously unused degree of freedom, is engineered to make 3D holograph and temporal focusing compatible.
  • "To ensure good diffraction efficiency of all spectral components by the SLM, we used a lens Lc to apply a small spherical phase pattern. The focal length was adjusted so that each spectral component of the pulse spans across the short axis of the SLM in the Fourier domain".
    • That is, they spatially and temporally defocus the pulse to better fill the SLM. The short axis of the SLM in this case is Y, per supplementary figure 2.
  • The image of the diffraction grating determines the plane of temporal focusing (with lenses L1 and L2); there is a secondary geometric focus due to Lc behind the temporal plane, which serves as an aberration.
  • The diffraction grating causes the temporal pattern to scan to produce a semi-spherical stimulated area ('disc').
  • Rather than creating a custom 3D holographic shape for each neuron, the SLM is after the diffraction grating -- it imposes phase and space modulation to the CTFP, effectively convolving it with a holograph of a cloud of points & hence replicating at each point.
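
The convolution picture in the last bullet can be illustrated numerically (toy array sizes and hypothetical target locations, not the paper's code): replicating one temporally-focused spot at many target points is exactly a 2D convolution of the spot profile with a cloud of delta functions, cheap via the FFT.

```python
import numpy as np

spot = np.zeros((64, 64))
spot[30:34, 30:34] = 1.0      # toy stand-in for the CTFP intensity disc
points = np.zeros((64, 64))
for r, c in [(10, 10), (40, 20), (25, 50)]:   # hypothetical target neuron sites
    points[r, c] = 1.0

# Convolution theorem: one FFT multiply places a copy of the spot at every point
field = np.real(np.fft.ifft2(np.fft.fft2(spot) * np.fft.fft2(points)))
print(abs(field.sum() - spot.sum() * points.sum()) < 1e-6)  # total intensity sums
```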

hide / / print
ref: -0 tags: phosphorescence fluorescence magnetic imaging slicing adam cohen date: 05-29-2019 19:41 gmt revision:8 [7] [6] [5] [4] [3] [2] [head]

A friend postulated using the triplet state phosphorescence as a magnetically-modulatable dye. E.g. magnetically slice a scattering biological sample, rather than slicing optically (light sheet, 2p) or mechanically. After a little digging:

I'd imagine that it should be possible to design a molecule -- a protein cage, perhaps a (fully unsaturated) terpene -- which isolates the excited state from oxygen quenching.

Adam Cohen at Harvard has been working a bit on this very idea, albeit with fluorescence not phosphorescence --

  • Optical imaging through scattering media via magnetically modulated fluorescence (2010)
    • The two species, pyrene and dimethylaniline are in solution.
    • Dimethylaniline absorbs photons and transfers an electron to pyrene to produce a singlet radical pair.
    • The magnetic field represses conversion of this singlet into a triplet; when two singlet electrons combine, they produce exciplex fluorescence.
  • Addition of an aliphatic-ether 12-O-2 linker improves things significantly --
  • Mapping Nanomagnetic Fields Using a Radical Pair Reaction (2011)
  • Which can be used with a 2p microscope:
  • Two-photon imaging of a magneto-fluorescent indicator for 3D optical magnetometry (2015)
    • Notably, they use decay kinetics of the excited state to yield measurements that are insensitive to photobleaching, indicator concentration, or local variations in optical excitation or collection efficiency (as opposed to \Delta F / F ).
    • Used phenanthrene (3 aromatic rings, not 4 in pyrene) as the excited electron acceptor, dimethylaniline again as the photo-electron generator.
    • Clear description:
      • A molecule with a singlet ground state absorbs a photon.
      • The photon drives electron transfer from a donor moiety to an acceptor moiety (either inter or intra molecular).
      • The electrons [ground state and excited state, donor] become sufficiently separated so that their spins do not interact, yet initially they preserve the spin coherence arising from their starting singlet state.
      • Each electron experiences a distinct set of hyperfine couplings to its surrounding protons (?), leading to a gradual loss of coherence and intersystem crossing (ISC) into a triplet state.
      • An external magnetic field can lock the precession of both electrons to the field axis, partially preserving coherence and suppressing ISC.
      • In some chemical systems, the triplet state is non-fluorescent, whereas the singlet pair can recombine and emit a photon.
      • Magnetochemical effects are remarkable because they arise at magnetic field strengths comparable to the hyperfine energy (typically 1-10 mT).
        • Compare this to the Zeeman effect, where overt splitting is at 0.1T.
    • Phenanthrene-dimethylaniline was dissolved in dimethylformamide (DMF). The solution was carefully degassed with nitrogen to prevent molecular-oxygen quenching.

Yet! Magnetic field effects do exist in solution:

hide / / print
ref: -2019 tags: super-resolution microscopy fluorescent protein molecules date: 05-28-2019 16:02 gmt revision:3 [2] [1] [0] [head]

PMID-30997987 Chemistry of Photosensitive Fluorophores for Single-Molecule Localization Microscopy

  • Excellent review of all the photo-convertable, photo-switchable, and more complex (photo-oxidation or reddening) of both proteins and small molecule fluorophore.
    • E.g. PA-GFP is one of the best -- good photoactivation quantum yield, good N ~ 300
    • Other small molecules, like Alexa Fluor 647 have a photon yield > 6700, which can be increased with triplet quenchers and antioxidants.
  • Describes the chemical mechanism of the various photo switching -- review is targeted at (bio)chemists interested in getting into imaging.
  • Emphasize that the critical figures of merit are the photoactivation quantum yield \Phi_{pa} and N, the overall photon yield before photobleaching.
  • See also Colorado lecture

hide / / print
ref: -2018 tags: Michael Levin youtube talk NIPS 2018 regeneration bioelectricity organism patterning flatworm date: 04-09-2019 18:50 gmt revision:1 [0] [head]

What Bodies Think About: Bioelectric Computation Outside the Nervous System - NeurIPS 2018

  • Short notes from watching the video, mostly interesting factoids. (The video presents a more coordinated narrative; I am resisting ending each of these statements with an exclamation point.)
  • Human children up to 7-11 years old can regenerate their fingertips.
  • Human embryos, when split in half early, develop into two normal humans; mouse embryos, when squished together, make one normal mouse.
  • Butterflies retain memories from their caterpillar stage, despite their brains liquefying during metamorphosis.
  • Flatworms are immortal, and can both grow and contract, as the environment requires.
    • They can also regenerate a whole body from segments, and know to make one head, tail, gut etc.
  • Single cell organisms, e.g. Lacrymaria, can have complex (and fast!) foraging / hunting plans -- without a brain or anything like it.
  • Axolotl can regenerate many parts of their body (appendages etc), including parts of the nervous system.
  • Frog embryos can self-organize an experimenter-jumbled body plan, despite the initial organization having never been experienced in evolution.
  • Salamanders, when their tail is grafted into a foot/leg position, remodel the transplant into a leg and foot.
  • Neurotransmitters are ancient; fungi, who diverged from other forms of life about 1.5 billion years ago, still use the same set of inter-cell transmitters e.g. serotonin, which is why modulatory substances from them have high affinity & a strong effect on humans.
  • Levin, collaborators and other developmental biologists have been using voltage indicators in embryos ... this is not just for neurons.
  • Can make different species head shapes in flatworms by exposing them to ion-channel modulating drugs. This despite the fact that the respective head shapes are from species that have been evolving separately for 150 million years.
  • Indeed, you can reprogram (with light gated ion channels, drugs, etc) to body shapes not seen in nature or not explored by evolution.
    • That said, this was experimental, not by design; Levin himself remarks that the biology that generates these body plans is not known.
  • Flatworms can store memories in bioelectric networks.
  • Frogs don't normally regenerate their limbs. But, with a drug cocktail targeting bioelectric signaling, they can regenerate semi-functional legs, complete with nerves, muscle, bones, and cartilage. The legs are functional (enough).
  • Manipulations of bioelectric signaling can reverse very serious genetic problems, e.g. deletion of Notch, to the point that tadpoles regain some ability for memory creation & recall.

  • I wonder how so much information can pass through the apparently scalar channel of membrane voltage. It seems you'd get symbol interference, and that many more signals would be required to pattern organs.
  • That said, calcium is used in a great many places in the cell for all sorts of signaling tasks, over many different timescales as well, and it doesn't seem to be plagued by interference.
    • First question from the audience was how cells differentiate organismal patterning signals and behavioral signals, e.g. muscle contraction.

hide / / print
ref: -2017 tags: V1 V4 visual cortex granger causality date: 03-20-2019 06:00 gmt revision:0 [head]

PMID-28739915 Interactions between feedback and lateral connections in the primary visual cortex

  • Liang H1, Gong X1, Chen M2,3, Yan Y2,3, Li W4,3, Gilbert CD5.
  • Extracellular ephys on V1 and V4 neurons in macaque monkeys trained on a fixation and saccade task.
  • Contour task: monkeys had to select the patch of lines, chosen to stimulate the recorded receptive fields, which had a continuous contour in it (again chosen to elicit a response in the recorded V1 / V4 neurons).
    • Variable length of the contour: 1, 3, 5, 7 bars. First part of analysis: only 7-bar trials.
  • Granger causality (GC) in V1 horizontal connectivity decreased significantly in the 0-30Hz band after taking into account V4 activity. Hence, V4 explains some of the causal activity in V1.
    • This result holds both with contour-contour (e.g. cells both tuned to the contours in V1), contour-background, and background-background.
    • Yet there was a greater change in the contour-BG and BG-contour cells when V4 was taken into account (Granger causality is directional, like KL divergence).
      • This result passes the shuffle test, where trial identities were shuffled.
      • True also when LFP is measured.
      • That said .. even though GC is sensitive to temporal features, might be nice to control with a distant area.
      • See supplementary figures (of which there are a lot) for the controls.
  • Summarily: Feedback from V4 strengthens V1 lateral connections.
  • Then they looked at trials with a variable number of contour bars.
  • V4 seems to have a greater GC influence on background cells relative to contour cells.
  • Using conditional GC, lateral interactions in V1 contribute more to contour integration than V4.
  • Greater GC in correct trials than incorrect trials.

  • Note: differences in firing rate can affect estimation of GC. Hence, some advise using thinning of the spike trains to yield parity.
  • Note: refs for horizontal connections in V1 [7-10, 37]
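
Pairwise Granger causality of the kind discussed above can be sketched with two AR-model fits (a toy illustration of the variance-ratio definition, not the paper's conditional-GC pipeline):

```python
import numpy as np

def lagged(v, p):
    """Design matrix with columns v[t-1] ... v[t-p] for t = p..T-1."""
    return np.column_stack([v[p - k:len(v) - k] for k in range(1, p + 1)])

def granger(x, y, p=2):
    """GC of y onto x: log ratio of residual variances of an AR(p) model
    of x without vs. with the past of y."""
    tgt = x[p:]
    Xr = np.column_stack([lagged(x, p), np.ones(len(tgt))])          # restricted
    Xf = np.column_stack([lagged(x, p), lagged(y, p), np.ones(len(tgt))])  # full
    rr = tgt - Xr @ np.linalg.lstsq(Xr, tgt, rcond=None)[0]
    rf = tgt - Xf @ np.linalg.lstsq(Xf, tgt, rcond=None)[0]
    return np.log(rr.var() / rf.var())

# Toy check: y drives x, so GC(y -> x) should dominate GC(x -> y)
rng = np.random.default_rng(0)
T = 2000
y = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + 0.8 * y[t - 1] + 0.1 * rng.normal()
print(granger(x, y) > granger(y, x))  # True
```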

hide / / print
ref: -2014 tags: gold nanowires intracellular recording korea date: 03-18-2019 23:02 gmt revision:1 [0] [head]

PMID-25112683 Subcellular Neural Probes from Single-Crystal Gold Nanowires

  • Korean authors... Mijeong Kang,† Seungmoon Jung,‡ Huanan Zhang,⊥ Taejoon Kang,∥ Hosuk Kang,† Youngdong Yoo,† Jin-Pyo Hong,# Jae-Pyoung Ahn,⊗ Juhyoun Kwak,† Daejong Jeon,‡* Nicholas A. Kotov,⊥* and Bongsoo Kim†*
  • 100nm single-crystal Au.
  • Able to get SUA despite size.
  • Springy, despite properties of bulk Au.
  • Nanowires fabricated on a sapphire substrate and picked up with a fine sharp W probe, then varnished with nail polish.

hide / / print
ref: -2011 tags: titanium micromachining chlorine argon plasma etch oxide nitride penetrating probes Kevin Otto date: 03-18-2019 22:57 gmt revision:1 [0] [head]

PMID-21360044 Robust penetrating microelectrodes for neural interfaces realized by titanium micromachining

  • Patrick T. McCarthy, Kevin J. Otto, Masaru P. Rao
  • Used a Cl / Ar plasma to deep-etch titanium film, 0.001 in / 25um thick, from Fine Metals Corp, Ashland VA.
  • Discuss various insulation (oxide/nitride) failure modes and lithography issues.

hide / / print
ref: -0 tags: credit assignment distributed feedback alignment penn state MNIST fashion backprop date: 03-16-2019 02:21 gmt revision:1 [0] [head]

Conducting credit assignment by aligning local distributed representations

  • Alexander G. Ororbia, Ankur Mali, Daniel Kifer, C. Lee Giles
  • Propose two related algorithms: Local Representation Alignment (LRA)-diff and LRA-fdbk.
    • LRA-diff is basically a modified form of backprop.
    • LRA-fdbk is a modified version of feedback alignment. {1432} {1423}
  • Test on MNIST (easy -- many digits can be discriminated with one pixel!) and fashion-MNIST (harder -- humans only get about 85% right!)
  • Use a Cauchy or log-penalty loss at each layer, which is somewhat unique and interesting: L(z,y) = \sum_{i=1}^n \log(1 + (y_i - z_i)^2) .
    • This is hence a saturating loss.
  1. Normal multi-layer-perceptron feedforward network. Pre-activation h^\ell and post-activation z^\ell are stored.
  2. Update the weights to minimize loss. This gradient calculation is identical to backprop, only they constrain the update to have a norm no bigger than c_1 . Z and Y are actual and desired output of the layer, as commented. Gradient includes the derivative of the nonlinear activation function.
  3. Generate an update for the pre-nonlinearity h^{\ell-1} to minimize the loss in the layer above. This again is very similar to backprop; it's the chain rule -- but the derivatives are vectors, of course, so those should be element-wise multiplications, not outer products (I think).
    1. Note h is updated -- derivatives of two nonlinearities.
  4. Feedback-alignment version, with random matrix E_\ell (elements drawn from a Gaussian distribution, \sigma = 1 ish).
    1. Only one nonlinearity derivative here -- bug?
  5. Move the rep and post activations in the specified gradient direction.
    1. Those \bar{h}^{\ell-1} variables are temporary holding -- but note that both lower and higher layers are updated.
  6. Do this K times, K = 1 to 50.
  • In practice K=1, with the LRA-fdbk algorithm, for the majority of the paper -- it works much better than LRA-diff (interesting .. bug?). Hence, this basically reduces to feedback alignment.
  • Demonstrate that LRA works much better with small initial weights, but basically because they tweak the algorithm to do this.
    • Need to see a positive control for this to be conclusive.
    • Again, why is FA so different from LRA-fdbk? Suspicious. Positive controls.
  • Attempted a network with Local Winner Take All (LWTA), which is a hard nonlinearity that LFA was able to account for & train through.
  • Also used Bernoulli neurons, and were able to successfully train. Unlike drop-out, these were stochastic at test time, and things still worked OK.
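
The per-layer log-penalty loss above is easy to probe numerically; the 'saturating' property is that its gradient shrinks toward zero for large errors, unlike squared error (a minimal sketch, not the authors' code):

```python
import numpy as np

def cauchy_loss(z, y):
    # L(z, y) = sum_i log(1 + (y_i - z_i)^2)
    return np.sum(np.log1p((y - z) ** 2))

def cauchy_grad(z, y):
    # dL/dz_i = -2 (y_i - z_i) / (1 + (y_i - z_i)^2)
    return -2.0 * (y - z) / (1.0 + (y - z) ** 2)

z = np.zeros(3)
y = np.array([0.1, 1.0, 10.0])   # small, unit, and large errors
print(cauchy_grad(z, y))         # gradient magnitude peaks at |y - z| = 1, then shrinks
```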

Lit review.
  • Logistic sigmoid can slow down learning, due to its non-zero mean (Glorot & Bengio 2010).
  • Recirculation algorithm (or generalized recirculation) is a precursor for target propagation.
  • Target propagation is all about the inverse of the forward propagation: if we had access to the inverse of the network of forward propagations, we could compute which input values at the lower levels of the network would result in better values at the top that would please the global cost.
    • This is a very different way of looking at it -- almost backwards!
    • And indeed, it's not really all that different from contrastive divergence. (even though CD doesn't work well with non-Bernoulli units)
  • Contrastive Hebbian learning also has two phases: one to fantasize, and one to try to make the fantasies look more like the input data.
  • Decoupled neural interfaces (Jaderberg et al 2016): learn a predictive model of error gradients (and inputs) instead of trying to use local information to estimate updated weights.

  • Yeah, call me a critic, but I'm not clear on the contribution of this paper; it smells precocious and over-sold.
    • Even the title. I was hoping for something more 'local' than per-layer computation. BP does that already!
  • They primarily report supportive tests, not discriminative or stressing tests; how does the algorithm fail?
    • Certainly a lot of work went into it..
  • I still don't see how the computation of a target through a random matrix, then using the delta/loss/error between that target and the feedforward activation to update weights, is much different than propagating the errors directly through a random feedback matrix. E.g. subtract then multiply, or multiply then subtract?

hide / / print
ref: -2011 tags: Andrew Ng high level unsupervised autoencoders date: 03-15-2019 06:09 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

Building High-level Features Using Large Scale Unsupervised Learning

  • Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeff Dean, Andrew Y. Ng
  • Input data 10M random 200x200 frames from youtube. Each video contributes only one frame.
  • Used local receptive fields, to reduce the communication requirements. 1000 computers, 16 cores each, 3 days.
  • "Strongly influenced by" Olshausen & Field {1448} -- but this is limited to a shallow architecture.
  • Lee et al 2008 show that stacked RBMs can model simple functions of the cortex.
  • Lee et al 2009 show that a convolutional DBN trained on faces can learn a face detector.
  • Their architecture: sparse deep autoencoder with
    • Local receptive fields: each feature of the autoencoder can connect to only a small region of the lower layer (e.g. non-convolutional)
      • Purely linear layer.
      • More biologically plausible & allows the learning of more invariances other than translational invariances (Le et al 2010).
      • No weight sharing means the network is extra large == 1 billion weights.
        • Still, the human visual cortex is about a million times larger in neurons and synapses.
    • L2 pooling (Hyvarinen et al 2009) which allows the learning of invariant features.
      • E.g. this is the square root of the sum of the squares of its inputs. Square root nonlinearity.
    • Local contrast normalization -- subtractive and divisive (Jarrett et al 2009)
  • Encoding weights W_1 and decoding weights W_2 are adjusted to minimize the reconstruction error, penalized by 0.1 * the sparse pooling-layer activation. The latter term encourages the network to find invariances.
  • minimize over W_1, W_2 : \sum_{i=1}^m ( ||W_2 W_1^T x^{(i)} - x^{(i)}||_2^2 + \lambda \sum_{j=1}^k \sqrt{\epsilon + H_j (W_1^T x^{(i)})^2} )
    • H_j are the weights to the j-th pooling element; \lambda = 0.1 ; m examples; k pooling units.
    • This is also known as reconstruction Topographic Independent Component Analysis.
    • Weights are updated through asynchronous SGD.
    • Minibatch size 100.
    • Note deeper autoencoders don't fare consistently better.
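
The objective above can be sketched in a few lines of numpy (toy sizes; H is fixed and random here purely for illustration, and only the loss is evaluated, not the asynchronous-SGD optimization):

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, k = 8, 20, 5                     # examples, input dim, pooling units (toy sizes)
X = rng.normal(size=(m, d))            # rows are the x^(i)
W1 = 0.1 * rng.normal(size=(d, k))     # encoding weights W_1
W2 = 0.1 * rng.normal(size=(d, k))     # decoding weights W_2
H = np.abs(rng.normal(size=(k, k)))    # pooling weights H_j (rows), fixed here
lam, eps = 0.1, 1e-3

Z = X @ W1                                      # W_1^T x^(i), shape (m, k)
recon_err = np.sum((Z @ W2.T - X) ** 2)         # sum_i ||W_2 W_1^T x - x||_2^2
pool = np.sum(np.sqrt(eps + (Z ** 2) @ H.T))    # sum_i sum_j sqrt(eps + H_j (W_1^T x)^2)
loss = recon_err + lam * pool
print(np.isfinite(loss))  # True
```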

hide / / print
ref: -2018 tags: biologically inspired deep learning feedback alignment direct difference target propagation date: 03-15-2019 05:51 gmt revision:5 [4] [3] [2] [1] [0] [head]

Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures

  • Sergey Bartunov, Adam Santoro, Blake A. Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap
  • As is known, many algorithms work well on MNIST, but fail on more complicated tasks, like CIFAR and ImageNet.
  • In their experiments, backprop still fares better than any of the biologically inspired / biologically plausible learning rules. This includes:
    • Feedback alignment {1432} {1423}
    • Vanilla target propagation
      • Problem: with convergent networks, layer inverses (top-down) will map all items of the same class to one target vector in each layer, which is very limiting.
      • Hence this algorithm was not directly investigated.
    • Difference target propagation (2015)
      • Uses the per-layer target \hat{h}_l = g(\hat{h}_{l+1}; \lambda_{l+1}) + [h_l - g(h_{l+1}; \lambda_{l+1})]
      • Or equivalently: \hat{h}_l = h_l + g(\hat{h}_{l+1}; \lambda_{l+1}) - g(h_{l+1}; \lambda_{l+1}) , where \lambda_l are the parameters of the inverse model and g() is the sum and nonlinearity.
      • That is, the target is modified ala delta rule by the difference between inverse-propagated higher layer target and inverse-propagated higher level activity.
        • Why? h_l should approach \hat{h}_l as h_{l+1} approaches \hat{h}_{l+1} .
        • Otherwise, the parameters in lower layers continue to be updated even when low loss is reached in the upper layers. (from original paper).
      • The last-to-penultimate layer's weights are trained via backprop, to prevent the template impoverishment noted above.
    • Simplified difference target propagation
      • They substitute a biologically plausible learning rule for the penultimate layer:
      • \hat{h}_{L-1} = h_{L-1} + g(\hat{h}_L; \lambda_L) - g(h_L; \lambda_L) , where there are L layers.
      • It's the same rule as the other layers.
      • Hence subject to the impoverishment problem with low-entropy labels.
    • Auxiliary output simplified difference target propagation
      • Add a vector z to the last-layer activation, which carries information about the input vector.
      • z is just a set of random features from the activation h_{L-1} .
  • Used both fully connected and locally-connected (e.g. convolution without weight sharing) MLP.
  • It's not so great:
  • Target propagation seems like a weak learner, worse than feedback alignment; not only is the feedback limited, but it does not take advantage of the statistics of the input.
    • Hence, some of these schemes may work better when combined with unsupervised learning rules.
    • Still, in the original paper they use difference-target propagation with autoencoders, and get reasonable stroke features..
  • Their general result that networks and learning rules need to be tested on more difficult tasks rings true, and might well be the main point of this otherwise meh paper.
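
The difference-target-propagation rule above can be checked in a few lines (g here is an arbitrary fixed inverse map, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
V = 0.5 * rng.normal(size=(4, 6))   # inverse-model parameters (stand-in for lambda_{l+1})
g = lambda h: np.tanh(h @ V.T)      # learned inverse: layer l+1 activity -> layer l

h_l = rng.normal(size=4)            # feedforward activity at layer l
h_lp1 = rng.normal(size=6)          # feedforward activity at layer l+1
h_hat_lp1 = h_lp1.copy()            # suppose layer l+1 has already reached its target

# DTP target: h_hat_l = h_l + g(h_hat_{l+1}) - g(h_{l+1})
h_hat_l = h_l + g(h_hat_lp1) - g(h_lp1)
print(np.allclose(h_hat_l, h_l))    # True: no further update once the layer above is on target
```

This checks the property quoted above: the lower layer's target collapses onto its actual activity exactly when the layer above has reached its own target, so updates stop once the upper layers are satisfied.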

hide / / print
ref: -2019 tags: lillicrap google brain backpropagation through time temporal credit assignment date: 03-14-2019 20:24 gmt revision:2 [1] [0] [head]

PMID-22325196 Backpropagation through time and the brain

  • Timothy Lillicrap and Adam Santoro
  • Backpropagation through time: the 'canonical' expansion of backprop to assign credit in recurrent neural networks used in machine learning.
    • E.g. variable roll-outs, where the error is propagated many times through the recurrent weight matrix, W^T .
    • This leads to the exploding or vanishing gradient problem.
  • TCA = temporal credit assignment. What led to this reward or error? How to affect memory to encourage or avoid this?
  • One approach is to simply truncate the error: truncated backpropagation through time (TBPTT). But this of course limits the horizon of learning.
  • The brain may do BPTT via replay in both the hippocampus and cortex Nat. Neuroscience 2007, thereby alleviating the need to retain long time histories of neuron activations (needed for derivative and credit assignment).
  • A lesser-known method of TCA is RTRL (real-time recurrent learning), i.e. forward-mode differentiation -- \partial h_t / \partial \theta is computed and maintained online, often with synaptic weight updates applied at each time step in which there is non-zero error. See A learning algorithm for continually running fully recurrent neural networks.
    • Big problem: a network with N recurrent units requires O(N^3) storage and O(N^4) computation at each time step.
    • Can be solved with Unbiased Online Recurrent optimization, which stores approximate but unbiased gradient estimates to reduce comp / storage.
  • Attention seems like a much better way of approaching the TCA problem: past events are stored externally, and the network learns a differentiable attention-alignment module for selecting these events.
    • Memory can be finite size, extending, or self-compressing.
    • Highlight the utility/necessity of content-addressable memory.
    • Attentional gating can eliminate the exploding / vanishing / corrupting gradient problems -- the gradient paths are skip-connections.
  • Biologically plausible: partial reactivation of CA3 memories induces re-activation of neocortical neurons responsible for initial encoding PMID-15685217 The organization of recent and remote memories. 2005
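
The storage argument above can be made concrete with back-of-envelope counts (N = 512 and T = 100 are arbitrary; constants and per-parameter state are ignored):

```python
# Rough memory scaling of RTRL vs. truncated BPTT (counts of stored floats only)
def rtrl_floats(N):
    # RTRL carries d h_t / d W online: N hidden states x (N*N) recurrent weights
    return N * N * N

def tbptt_floats(N, T):
    # TBPTT instead stores the last T hidden-state vectors of size N
    return N * T

print(rtrl_floats(512))        # 134217728 (~1.3e8)
print(tbptt_floats(512, 100))  # 51200
```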

  • I remain reserved about the utility of thinking in terms of gradients when describing how the brain learns. Correlations, yes; causation, absolutely; credit assignment, for sure. Yet propagating gradients as a means for changing network weights seems at best a part of the puzzle. So much of behavior and internal cognitive life involves explicit, conscious computation of cause and credit.
  • This leaves me much more sanguine about the use of external memory to guide behavior ... but differentiable attention? Hmm.

hide / / print
ref: -2012 tags: DiCarlo Visual object recognition inferior temporal cortex dorsal ventral stream V1 date: 03-13-2019 22:24 gmt revision:1 [0] [head]

PMID-22325196 How Does the Brain Solve Visual Object Recognition

  • James DiCarlo, Davide Zoccolan, Nicole C Rust.
  • Infero-temporal cortex is organized into behaviorally relevant categories, not necessarily retinotopically, as demonstrated with TMS studies in humans, and lesion studies in other primates.
    • Synaptic transmission takes 1-2ms; dendritic propagation ?, axonal propagation ~1ms (e.g. pyramidal antidromic activation latency 1.2-1.3ms), so each layer can use several synapses for computation.
  • Results from the ventral stream computation can be well described by a firing rate code binned at ~ 50ms. Such a code can reliably describe and predict behavior
    • Though: this does not rule out codes with finer temporal resolution.
    • Though anyway: it may be an inferential issue, as behavior operates at this timescale.
  • IT neurons' responses are sparse, but still contain information about position and size.
    • They are not narrowly tuned detectors, not grandmother cells; they are selective and complex but not narrow.
    • Indeed, IT neurons with the highest shape selectivities are the least tolerant to changes in position, scale, contrast, and visual clutter. (Zoccolan et al 2007)
    • Position information avoids the need to re-bind attributes with perceptual categories -- no need for synchrony binding.
  • Decoded IT population activity of ~100 neurons exceeds artificial vision systems (Pinto et al 2010).
  • As in {1448}, there is a ~ 30x expansion of the number of neurons (axons?) in V1 vs the optic tract; serves to allow controlled sparsity.
  • Dispute in the field over primarily hierarchical & feed-forward vs. highly structured feedback being essential for performance (and learning?) of the system.
    • One could hypothesize that feedback signals help lower levels perform inference with noisy inputs; or feedback from higher layers, which is prevalent and manifest (and must be important; all that membrane is not wasted..)
    • DiCarlo questions if the re-entrant intra-area and inter-area communication is necessary for building object representations.
      • This could be tested with optogenetic approaches; since the publication, it may have been..
      • Feedback-type active perception may be evinced in binocular rivalry, or in visual illusions;
      • Yet 150ms immediate object recognition probably does not require it.
  • Authors propose thinking about neurons/local circuits as having 'job descriptions', a metaphor that couples neuroscience to human organization: who is providing feedback to the workers? Who is providing feedback as to job function? (Hinton 1995).
  • Propose local subspace untangling; when this is stacked and tiled, it is sufficient for object perception.
    • Indeed, modern deep convolutional networks behave this way; yet they still can't match human performance (perhaps not sparse enough, not enough representational capability)
    • Cite Hinton & Salakhutdinov 2006.
  • The AND-OR or conv-pooling architecture was proposed by Hubel and Wiesel back in 1962! In their paper's formulation, they call it a Normalized Non-Linear model, NLN.
  1. Nonlinearities tend to flatten object manifolds; even with random weights, NLN models tend to produce easier to decode object identities, based on strength of normalization. See also {714}.
  2. NLNs are tuned / become tuned to the statistics of real images. But they do not get into discrimination / perception thereof..
  3. NLNs learn temporally: inputs that occur temporally adjacent lead to similar responses.
    1. But: saccades? Humans saccade 100 million times per year!
      1. This could be seen as a continuity prior: the world is unlikely to change between saccades, so one can infer the identity and positions of objects on the retina, which say can be used to tune different retinotopic IT neurons..
    2. See Li & DiCarlo -- manipulation of image statistics changing visual responses.
  • Regarding (3) above, perhaps attention is a modifier / learning gate?
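
One NLN stage (the AND-OR motif above) can be sketched in numpy; the rectification and the global divisive constant are assumptions, since the notes don't pin down those details:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(size=(16, 16))    # toy input patch
filt = rng.normal(size=(3, 3))     # one linear receptive field

# "AND": valid-mode linear filtering, then rectification
resp = np.array([[np.sum(img[i:i + 3, j:j + 3] * filt) for j in range(14)]
                 for i in range(14)])
resp = np.maximum(resp, 0.0)

# "OR": 2x2 max pooling
pooled = resp.reshape(7, 2, 7, 2).max(axis=(1, 3))

# Normalization: divide by pooled population energy (constant is illustrative)
norm = pooled / (1.0 + np.sqrt(np.sum(pooled ** 2)))
print(norm.shape)  # (7, 7)
```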

hide / / print
ref: Schmidt-1978.09 tags: Schmidt BMI original operant conditioning cortex HOT pyramidal information antidromic date: 03-12-2019 23:35 gmt revision:11 [10] [9] [8] [7] [6] [5] [head]

PMID-101388[0] Fine control of operantly conditioned firing patterns of cortical neurons.

  • Hand-arm area of M1, 11 or 12 chronic recording electrodes, 3 monkeys.
    • But, they only used one unit at a time in the conditioning task.
  • Observed conditioning in 77% of single units and 65% of combined units (multiunits?).
  • Trained to move a handle to a position indicated by 8 annular cursor lights.
    • Cursor was updated at 50 Hz -- this was just a series of lights! Talk about simple feedback...
    • Investigated different smoothing: too fast, FR does not stay in target; too slow, cursor acquires target too slowly.
      • My gamma function is very similar to their lowpass filter used for smoothing the firing rates.
    • 4 or 8 target random tracking task
    • Time-out of 8 seconds
    • Run of 40 trials
      • The conditioning reached a significant level of performance after 2.2 runs of 40 trials (in well-trained monkeys); typically, they did 18 runs/day (720 trials)
  • Recordings:
    • Scalar mapping of unit firing rate to cursor position.
    • Filtered 600-6kHz
    • Each accepted spike triggered a generator that produced a pulse of constant amplitude and width -> this was fed into a lowpass filter (1.5 to 2.5 & 3.5 Hz cutoff), then a gain stage, then an ADC, then (presumably) the PDP.
      • Can determine if these units were in the pyramidal tract by measuring antidromic delay.
    • recorded one neuron for 108 days!!
      • Neuronal activity is still being recorded from one monkey 24 months after chronic implantation of the microelectrodes.
    • Average period in which conditioning was attempted was 3.12 days.
  • Successful conditioning was always associated with specific repeatable limb movements
    • "However, what appears to be conditioned in these experiments is a movement, and the neuron under study is correlated with that movement." YES.
    • The monkeys clearly learned to make (increasingly refined) movement to modulate the firing activity of the recorded units.
    • The monkey learned to turn off certain units with specific limb positions; the monkey used exaggerated movements for these purposes.
      • e.g. finger and shoulder movements, isometric contraction in one case.
  • Trained some monkeys for > 15 months; animals got better at the task over time.
  • PDP-12 computer.
  • Information measure: 0 bits for missed targets, 2 for a 4 target task, 3 for 8 target task; information rate = total number of bits / time to acquire targets.
    • 3.85 bits/sec peak with 4 targets, 500ms hold time
    • With this, monkeys were able to exert fine control of firing rate.
    • Damn! compare to Paninski! [1]
  • 4.29 bits/sec when the same task was performed with a manipulandum & wrist movement
  • they were able to condition 77% of individual neurons and 65% of combined units.
  • Implanted a pyramidal tract electrode in one monkey; both cells recorded at that time were pyramidal tract neurons, antidromic latencies of 1.2 - 1.3ms.
    • Failures had no relation to overt movements of the monkey.
  • Fetz and Baker [2,3,4,5] found that 65% of precentral neurons could be conditioned for increased or decreased firing rates.
    • and it only took 6.5 minutes, on average, for the units to change firing rates!
  • Summarized in [1].
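A quick sketch of the information measure described above (my own minimal reading of the metric -- bits per acquired target over total acquisition time; the exact hold-time accounting is assumed):

```python
import math

def info_rate(n_targets, n_hits, total_time_s):
    """Bits acquired per second: log2(n_targets) bits for each correctly
    acquired target (2 bits for 4 targets, 3 bits for 8 targets), 0 bits
    for a miss, divided by the total time to acquire the targets."""
    return n_hits * math.log2(n_targets) / total_time_s

# e.g. 40 correct trials on the 4-target task in ~20.8 s gives ~3.85 bits/s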


hide / / print
ref: -2018 tags: sparse representation auditory cortex excitatation inhibition balance date: 03-11-2019 20:47 gmt revision:1 [0] [head]

PMID-30307493 Sparse Representation in Awake Auditory Cortex: Cell-type Dependence, Synaptic Mechanisms, Developmental Emergence, and Modulation.

  • Sparse representation arises during development in an experience-dependent manner, accompanied by differential changes of excitatory input strength and a transition from unimodal to bimodal distribution of E/I ratios.

hide / / print
ref: -2015 tags: conjugate light electron tomography mouse visual cortex fluorescent label UNC cryoembedding date: 03-11-2019 19:37 gmt revision:1 [0] [head]

PMID-25855189 Mapping Synapses by Conjugate Light-Electron Array Tomography

  • Use aligned interleaved immunofluorescence imaging followed by array EM (FESEM). 70nm thick sections.
  • For IHC, tissue must be dehydrated & embedded in a resin.
  • However, the dehydration disrupts cell membranes and ultrastructural details viewed via EM ...
  • Hence, EM microscopy uses osmium tetroxide to cross-link the lipids.
  • ... Yet that also disrupts / refolds the proteins, making IHC fail.
  • Solution is to dehydrate & embed at cryo temp, -70C, where the lipids do not dissolve. They used Lowicryl HM-20.
  • We show that cryoembedding provides markedly improved ultrastructure while still permitting multiplexed immunohistochemistry.

hide / / print
ref: -2012 tags: octopamine STDP locust LTP LTD olfactory bulb date: 03-11-2019 18:59 gmt revision:5 [4] [3] [2] [1] [0] [head]

PMID-22278062 Conditional modulation of spike-timing-dependent plasticity for olfactory learning.

  • Looked at the synapses from the mushroom body (Kenyon cells, sparse code) to the beta-lobe (bLN) in locusts.
  • Used in-vivo dendrite patch, sharp micropipette.
  • Found that, with a controlled mushroom body extracellular stim for the plasticity induction protocol at the KC -> bLN synapse, they were able to get potentiation and depression in accord with STDP.
  • This STDP became pure depression in the presence of octopamine.
  • See also / supersedes: Synaptic Learning Rules and Sparse Coding in a Model Sensory System, Luca A. Finelli, Seth Haney, Maxim Bazhenov, Mark Stopfer, Terrence J. Sejnowski 2008

hide / / print
ref: -2004 tags: Olshausen sparse coding review date: 03-08-2019 07:02 gmt revision:0 [head]

PMID-15321069 Sparse coding of sensory inputs

  • Classic review, Olshausen and Field. 15 years old now!
  • Note the sparsity here is in neuronal activation, not synaptic activity (though one should follow the other).
  • References Lewicki's auditory studies, Efficient coding of natural sounds 2002; properties of early auditory neurons are well suited for producing a sparse independent code.
    • Studies have found near binary encoding of stimuli in rat auditory cortex -- e.g. one spike per noise stimulus.
  • Suggests that overcomplete representations (e.g. where there are more 'second layer' neurons than inputs or pixels) are useful for flattening manifolds in the input space, making feature extraction easier.
    • But then you have an under-determined problem, where presumably sparsity metrics step in to restrict the actual coding space. Authors mention that this could lead to degeneracy.
    • Example is the early visual cortex, where axons to higher layers exceed those from the LGN by a factor of 25. Which, they say, may be a compromise between over-representation and degeneracy.
  • Sparse coding is a necessity from an energy standpoint -- only one in 50 neurons can be active at any given time.
  • Sparsity increases when classical receptive field stimuli in V1 is expanded with a real-world-statistics surround. (Gallant 2002).

hide / / print
ref: -2006 tags: Mark Bear reward visual cortex cholinergic date: 03-06-2019 04:54 gmt revision:1 [0] [head]

PMID-16543459 Reward timing in the primary visual cortex

  • Used 192-IgG-Saporin (saporin immunotoxin) to selectively lesion cholinergic fibers locally in V1 following a visual stimulus -> licking reward delay behavior.
  • Visual stimulus is full-field light, delivered to either the left or right eye.
    • This is scarcely a challenging task; perhaps they or others have followed up?
  • These examples illustrate that both cue 1-dominant and cue 2-dominant neurons recorded from intact animals express NRTs that appropriately reflect the new policy. Conversely, although cue 1- and cue 2-dominant neurons recorded from 192-IgG-saporin-infused animals are capable of displaying all forms of reward timing activity, "they do not update their NRTs but rather persist in reporting the now outdated policy."
    • NRT = neural reaction time.
  • This needs to be controlled with recordings from other cortical areas.
  • Acquisition of reward based response is simultaneously interesting and boring -- what about the normal, discriminative and perceptual function of the cortex?
  • See also follow-up work PMID-23439124 A cholinergic mechanism for reward timing within primary visual cortex.

hide / / print
ref: -2017 tags: vicarious dileep george captcha message passing inference heuristic network date: 03-06-2019 04:31 gmt revision:2 [1] [0] [head]

PMID-29074582 A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs

  • Vicarious supplementary materials on their RCN (recursive cortical network).
  • Factor scene into shape and appearance, which CNN or DCNN do not do -- they conflate (ish? what about the style networks?)
    • They call this the coloring book approach -- extract shape then attach appearance.
  • Hierarchy of feature layers F_{frc} (binary) and pooling layers H_{frc} (multinomial), where f is feature, r is row, c is column (e.g. over image space).
  • Each layer is exclusively conditional on the layer above it, and all features in a layer are conditionally independent given the layer above.
  • Pool variables H_{frc} are multinomial, with each value associated with a feature, plus one 'off' value.
    • These features form a ‘pool’, which can/does have translation invariance.
  • If any of the pool variables are set to enable F, then that feature is set (an or-operation). Many pools can contain a given feature.
  • One can think of members of a pool as different alternatives of similar features.
  • Pools can be connected laterally, so each is dependent on the activity of its neighbors. This can be used to enforce edge continuity.
  • Each bottom-level feature corresponds to an edge, which defines 'in' and 'out' to define shape, Y.
  • These variables Y are also interconnected, and form a conditional random field, a 'Potts model'. Y is generated by Gibbs sampling given the F-H hierarchy above it.
  • Below Y, the per-pixel model X specifies texture with some conditional radial dependence.
  • The model amounts to a probabilistic model for which exact inference is impossible -- hence you must do approximate inference, where a bottom-up pass estimates the category (with lateral connections turned off), and a top-down pass estimates the object mask. Multiple passes can be done for multiple objects.
  • Model has a hard time moving from rgb pixels to edge 'in' and 'out'; they use an edge detection pre-processing stage, e.g. Gabor filter.
  • Training follows a very intuitive, hierarchical feature building heuristic, where if some object or collection of lower level features is not present, it’s added to the feature-pool tree.
    • This includes some winner-take-all heuristic for sparsification.
    • Also greedily learn some sort of feature 'dictionary' from individual unlabeled images.
  • Lateral connections are learned similarly, with a quasi-hebbian heuristic.
  • Neuroscience inspiration: see refs 9, 98 for message-passing based Bayesian inference.

  • Overall, a very heuristic, detail-centric, iteratively generated model and set of algorithms. You get the sense that this was really the work of Dileep George or only a few people; that it was generated by successively patching and improving the model/algo to make up for observed failures and problems.
    • As such, it offers little long-term vision for what is possible, or how perception and cognition occurs.
    • Instead, proof is shown that, well, engineering works, and the space of possible solutions -- including relatively simple elements like dictionaries and WTA -- is large and fecund.
      • Unclear how this will scale to even more complex real-world problems, where one would desire a solution that does not have to have each level carefully engineered.
      • Modern DCNN, at least, do not seem to have this property -- the structure is learned from the (alas, labeled) data.
  • This extends to the fact that yes, their purpose-built system achieves state of the art performance on the designated CAPTCHA tasks.
  • Check: B. M. Lake, R. Salakhutdinov, J. B. Tenenbaum, Human-level concept learning through probabilistic program induction. Science 350, 1332–1338 (2015). doi:10.1126/science.aab3050 Medline

hide / / print
ref: -2018 tags: cortex layer martinotti interneuron somatostatin S1 V1 morphology cell type morphological recovery patch seq date: 03-06-2019 02:51 gmt revision:3 [2] [1] [0] [head]

Neocortical layer 4 in adult mouse differs in major cell types and circuit organization between primary sensory areas

  • Using whole-cell recordings with morphological recovery, we identified one major excitatory and seven inhibitory types of neurons in L4 of adult mouse visual cortex (V1).
  • Nearly all excitatory neurons were pyramidal and almost all Somatostatin-positive (SOM+) neurons were Martinotti cells.
  • In contrast, in somatosensory cortex (S1), excitatory cells were mostly stellate and SOM+ cells were non-Martinotti.
  • These morphologically distinct SOM+ interneurons correspond to different transcriptomic cell types and are differentially integrated into the local circuit with only S1 cells receiving local excitatory input.
  • Our results challenge the classical view of a canonical microcircuit repeated through the neocortex.
  • Instead we propose that cell-type specific circuit motifs, such as the Martinotti/pyramidal pair, are optionally used across the cortex as building blocks to assemble cortical circuits.
  • Note preponderance of axons.
  • Classifications:
    • Pyr pyramidal cells
    • BC Basket cells
    • MC Martinotti cells
    • BPC bipolar cells
    • NFC neurogliaform cells
    • SC shrub cells
    • DBC double bouquet cells
    • HEC horizontally elongated cells.
  • Using Patch-seq

hide / / print
ref: -2012 tags: parvalbumin interneurons V1 perceptual discrimination mice date: 03-06-2019 01:46 gmt revision:0 [head]

PMID-22878719 Activation of specific interneurons improves V1 feature selectivity and visual perception

  • Lee SH1, Kwan AC, Zhang S, Phoumthipphavong V, Flannery JG, Masmanidis SC, Taniguchi H, Huang ZJ, Zhang F, Boyden ES, Deisseroth K, Dan Y.
  • Optogenetic Activation of PV+ interneurons improves neuronal feature selectivity and improves perceptual discrimination (!!!)

hide / / print
ref: -2016 tags: MAPseq Zador connectome mRNA plasmic library barcodes Peikon date: 03-06-2019 00:51 gmt revision:1 [0] [head]

PMID-27545715 High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

  • Justus M. Kebschull, Pedro Garcia da Silva, Ashlan P. Reid, Ian D. Peikon, Dinu F. Albeanu, Anthony M. Zador
  • Another tool for the toolboxes, but I still can't help but to like microscopy: while the number of labels in MAPseq is far higher, the information per read-out is much lower; an imaged slice holds a lot of information, including dendritic / axonal morphology, which sequencing doesn't get. Natch, you'd want to use both, or FISseq + ExM.

hide / / print
ref: -2019 tags: three photon imaging visual cortex THG chirp NOPA mice GCaMP6 MIT date: 03-01-2019 18:46 gmt revision:2 [1] [0] [head]

PMID-30635577 Functional imaging of visual cortical layers and subplate in awake mice with optimized three photon microscopy

  • Murat Yildirim, Hiroki Sugihara, Peter T.C. So & Mriganka Sur
  • Used a fs Ti:Sapphire 16W pump into a non-colinear optical parametric amplifier (both from Spectra-Physics) to generate the 1300nm light.
  • Used pulse compensation to get the pulse width at the output of the objective to 40 fs.
    • Three-photon excited signal is inverse quadratic in pulse width:
    • N \sim \frac{P^3 \delta}{(\tau R)^2} \left(\frac{NA^2}{2 h c \lambda}\right)^3
    • P is power, \delta is the 3p cross-section, \tau is pulse width, R repetition rate, NA is the numerical aperture (sixth power of NA!!!), h, c, and \lambda Planck's constant, speed of light, and wavelength respectively.
  • Optimized excitation per depth by monitoring damage levels. Varied from 0.5 nJ to 5 nJ.
  • Imaged up to 1.5mm deep! All the way to the white matter / subplate.
  • Allegedly used a custom scan and tube lens to minimize aberrations in the excitation path (hence improve 3p excitation)
  • Layer 5 neurons are more broadly tuned for orientation than other layers. But the data is not dramatic.
  • Used straightforward metrics for tuning, using a positive and negative bump gaussian fit, then vector averaging to get global orientation selectivity.
  • Interesting that the variance between layers seems higher than between mice.
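Plugging numbers into the scaling relation above (my own sketch -- absolute units are not meaningful here, only ratios between parameter settings):

```python
def three_photon_signal(P, delta, tau, R, NA, lam):
    """Relative 3-photon excitation, N ~ (P^3 d / (tau R)^2) (NA^2/(2 h c lam))^3.
    Only relative comparisons between parameter sets are meaningful."""
    h, c = 6.626e-34, 3.0e8  # Planck's constant, speed of light
    return (P ** 3 * delta) / (tau * R) ** 2 * (NA ** 2 / (2 * h * c * lam)) ** 3

# halving the pulse width quadruples the signal (inverse quadratic in tau):
base  = three_photon_signal(1.0, 1.0, 40e-15, 1e6, 1.05, 1.3e-6)
short = three_photon_signal(1.0, 1.0, 20e-15, 1e6, 1.05, 1.3e-6)
```

Doubling the NA, by the same relation, gives a 2^6 = 64x signal increase -- hence the emphasis on the sixth power of NA.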

hide / / print
ref: -2017 tags: attention transformer language model youtube google tech talk date: 02-26-2019 20:28 gmt revision:3 [2] [1] [0] [head]

Attention is all you need

  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
  • Attention is all you need neural network models
  • Good summary, along with: The Illustrated Transformer (please refer to this!)
  • Łukasz Kaiser mentions a few times how fragile the network is -- how easy it is to make something that doesn't train at all, or how many tricks by google experts were needed to make things work properly. It might be bravado or bluffing, but this is arguably not the way that biology fails.
  • Encoding:
  • Input is words encoded as 512-length vectors.
  • Vectors are transformed into length 64 vectors: query, key and value via differentiable weight matrices.
  • Attention is computed as the dot-product of the query (current input word) with the keys (values of the other words).
    • This value is scaled and passed through a softmax function to result in one attentional signal scaling the value.
  • Multiple heads' output are concatenated together, and this output is passed through a final weight matrix to produce a final value for the next layer.
    • So, attention in this respect looks like a conditional gain field.
  • 'Final value' above is then passed through a single layer feedforward net, with resnet style jump.
  • Decoding:
  • Use the attentional key value from the encoder to determine the first word through the output encoding (?) Not clear.
  • Subsequent causal decodes depend on the already 'spoken' words, plus the key-values from the encoder.
  • Output is a one-hot softmax layer from a feedforward layer; the sum total is differentiable from input to output using cross-entropy loss or KL divergence.
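The encoding steps above boil down to single-head scaled dot-product attention; a numpy sketch (sizes 512 and 64 as in the notes; the multi-head concatenation and feedforward stages are omitted):

```python
import numpy as np

def attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # query-key dot products, scaled
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)        # softmax over keys
    return w @ V                              # attention-weighted sum of values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 512))                 # 5 words, 512-dim embeddings
# differentiable weight matrices projecting to length-64 query/key/value:
Wq, Wk, Wv = (rng.normal(size=(512, 64)) / np.sqrt(512) for _ in range(3))
out = attention(X @ Wq, X @ Wk, X @ Wv)       # one 64-dim output per word
```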

hide / / print
ref: -2006 tags: hinton contrastive divergence deep belief nets date: 02-20-2019 02:38 gmt revision:0 [head]

PMID-16764513 A fast learning algorithm for deep belief nets.

  • Hinton GE1, Osindero S, Teh YW.
  • Very highly cited contrastive divergence paper.
  • Back in 2006 yielded state of the art MNIST performance.
  • And, being CD, can be used in an unsupervised mode.

hide / / print
ref: -2015 tags: CWEETS amplified Fourier imaging raman amplification date: 02-19-2019 06:46 gmt revision:1 [0] [head]

Amplified dispersive Fourier-Transform Imaging for Ultrafast Displacement sensing and Barcode Reading

hide / / print
ref: -2011 tags: HiLo speckle imaging confocal boston university optical sectioning date: 02-19-2019 06:18 gmt revision:2 [1] [0] [head]

PMID-21280920 Optically sectioned in vivo imaging with speckle illumination HiLo microscopy

  • Ah, brilliant! Illuminate a sample with a speckle pattern from a laser, and use this to optically section the data -- the contrast of the speckle pattern shows how in focus the sample is.
    • Hence, the contrast indicates the in-focus vs out-of-focus ratio in a region.
  • The speckle statistics are invariant even in a scattering media, as scattering only further randomizes an already random laser phase front. (Within some limits.)
  • HiLo microscopy involves illuminating with a speckle pattern, then illuminating with standard uniform illumination, resulting in a diffraction-limited optically sectioned image. PMID-18709098
  • Algorithm is :
    • Take the speckle image and subtract the uniform image to get \delta I
    • Bandpass \delta I
    • Measure the standard deviation of \delta I to get a weighting function C^2_{\delta s}
    • Debias this estimate based on sensor..
    • Generate the low-passed image from the weighted uniform image, LP[C_{\delta s} I_u], and the high-pass from the complement, HP = 1 - LP
    • Resultant image is a weighted sum of the highpassed and lowpassed images.
  • Looks about as good as confocal.
  • Cited by...
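The algorithm steps above in a rough numpy sketch (my own; the filter shapes, sigma, and relative gain are assumptions, not the paper's values, and the sensor-noise debiasing step is skipped):

```python
import numpy as np

def lowpass(img, sigma):
    """Gaussian low-pass via FFT -- a crude stand-in for the paper's filters."""
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    H = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * H))

def hilo(speckle, uniform, sigma=4.0, eta=1.0):
    """HiLo sketch: local contrast of (speckle - uniform) weights the
    in-focus low frequencies; the complement high-pass of the uniform
    image supplies the rest."""
    dI = speckle - uniform                    # speckle-only fluctuations
    dI = dI - lowpass(dI, sigma)              # crude bandpass
    C = np.sqrt(np.maximum(lowpass(dI ** 2, sigma), 0.0))  # local std -> weight
    lo = lowpass(C * uniform, sigma)          # weighted low-passed image
    hi = uniform - lowpass(uniform, sigma)    # HP = 1 - LP
    return eta * lo + hi                      # weighted sum, eta tuned in practice
```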

hide / / print
ref: -0 tags: Airy light sheet microscopy attenuation compensation LSM imaging date: 02-19-2019 04:51 gmt revision:1 [0] [head]

Light-sheet microscopy with attenuation-compensated propagation-invariant beams

  • Ah ... beautiful illustration of the airy light sheet concept.
  • In practice, used a LCOS SLM to generate the beam (as .. phase matters!) plus an AOM to scan the beam.
    • Microscope can operate either in SPIM (single plane imaging microscope) or DSLM (digital scanning light sheet microscope),
  • Improves signal-to-background ratio (SBR) and contrast-to-noise ratio (CNR) (not sure why they don't use SNR..?)

hide / / print
ref: -0 tags: convolutional neural networks audio feature extraction vocals keras tensor flow fourier date: 02-18-2019 21:40 gmt revision:3 [2] [1] [0] [head]

Audio AI: isolating vocals from stereo music using Convolutional Neural Networks

  • Ale Koretzky
  • Fairly standard CNN, but use a binary STFT mask to isolate vocals from instruments.
    • Get Fourier-type time-domain artifacts as a result; but it sounds reasonable.
    • Didn't realize it until this paper / blog post: stacked conv layers combine channels.
    • E.g. input size 513 × 25 × 16 (512 freq channels + DC, 25 time slices, 16 filter channels) into a 3x3 Conv2D -> 3·3·16 + 16 = 160 total parameters (filter weights and bias).
    • If this is followed by a second Conv2D layer of the same parameters, the layer acts as a 'normal' fully connected network in the channel dimension.
    • This means there are (3·3·16)·16 + 16 = 2320 parameters.
      • Each input channel from the previous conv layer has independent weights -- they are not shared -- whereas the spatial weights are shared.
      • Hence, same number of input channels and output channels (in this case; doesn't have to be).
      • This, naturally, falls out of spatial weight sharing, which might be obvious in retrospect; of course it doesn't make sense to share non-spatial weights.
      • See also: https://datascience.stackexchange.com/questions/17064/number-of-parameters-for-convolution-layers
  • Synthesized a large training set via acapella youtube videos plus instrument tabs .. that looked like a lot of work!
    • Need a karaoke database here.
  • Authors wrapped this into a realtime extraction toolkit.
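The parameter counting above as a one-liner, reproducing both numbers from the post:

```python
def conv2d_params(kh, kw, c_in, c_out):
    """Conv2D parameter count: spatial weights are shared across the image,
    but each input channel gets its own weights (not shared), plus one
    bias per output filter."""
    return kh * kw * c_in * c_out + c_out

first  = conv2d_params(3, 3, 1, 16)    # 1 input channel into 16 filters
second = conv2d_params(3, 3, 16, 16)   # 16 channels into 16 filters
```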

hide / / print
ref: -2019 tags: Arild Nokland local error signals backprop neural networks mnist cifar VGG date: 02-15-2019 03:15 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

Training neural networks with local error signals

  • Arild Nokland and Lars H Eidnes
  • Idea is to use one+ supplementary neural networks to measure within-batch matching loss between transformed hidden-layer output and one-hot label data to produce layer-local learning signals (gradients) for improving local representation.
  • Hence, no backprop. Error signals are all local, and inter-layer dependencies are not explicitly accounted for (! I think).
  • L_{sim}: given a mini-batch of hidden layer activations H = (h_1, ..., h_n) and a one-hot encoded label matrix Y = (y_1, ..., y_n):
    • L_{sim} = || S(NeuralNet(H)) - S(Y) ||^2_F (presumably the Frobenius norm)
    • NeuralNet() is a convolutional neural net (trained how?) 3x3, stride 1, reduces output to 2.
    • S() is the cosine similarity matrix, or correlation matrix, of a mini-batch.
  • L_{pred} = CrossEntropy(Y, W^T H) where W is a weight matrix, dim hidden_size × n_classes.
    • Cross-entropy is H(Y, W^T H) = -\Sigma_{i,j} [ Y_{i,j} log((W^T H)_{i,j}) + (1 - Y_{i,j}) log(1 - (W^T H)_{i,j}) ]
  • Sim-bio loss: replace NeuralNet() with average-pooling and standard-deviation ops. Plus the one-hot target is replaced with a random transformation of the same target vector.
  • Overall loss: 99% L_{sim}, 1% L_{pred}
    • Despite the unequal weighting, both seem to improve test prediction on all examples.
  • VGG like network, with dropout and cutout (blacking out square regions of input space), batch size 128.
  • Tested on all the relevant datasets: MNIST, Fashion-MNIST, Kuzushiji-MNIST, CIFAR-10, CIFAR-100, STL-10, SVHN.
  • Pretty decent review of similarity matching measures at the beginning of the paper; not extensive but puts everything in context.
    • See for example non-negative matrix factorization using Hebbian and anti-Hebbian learning in and Chklovskii 2014.
  • Emphasis put on biologically realistic learning, including the use of feedback alignment {1423}
    • Yet: this was entirely supervised learning, as the labels were propagated back to each layer.
    • More likely that biology is setup to maximize available labels (not a new concept).
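The similarity-matching loss above in numpy (a sketch: the paper's NeuralNet() transform of H is omitted, and I'm assuming ||.||_F is the Frobenius norm):

```python
import numpy as np

def cosine_sim_matrix(M):
    """Pairwise cosine similarity between the rows of a mini-batch matrix."""
    Mn = M / np.linalg.norm(M, axis=1, keepdims=True)
    return Mn @ Mn.T

def l_sim(H, Y):
    """Squared Frobenius distance between the within-batch similarity
    structure of hidden activations H and of one-hot labels Y."""
    return float(np.sum((cosine_sim_matrix(H) - cosine_sim_matrix(Y)) ** 2))

# hidden activations whose similarity structure matches the labels -> zero loss
Y = np.eye(4)
```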

hide / / print
ref: -2008 tags: representational similarity analysis fMRI date: 02-15-2019 02:27 gmt revision:1 [0] [head]

PMID-19104670 Representational Similarity Analysis – Connecting the Branches of Systems Neuroscience

  • Nikolaus Kriegeskorte, Marieke Mur, and Peter Bandettini
  • Alright, there seems to be no math in the article (?), but it seems well cited so best be on the radar.
  • RDM = representational dissimilarity matrices
    • Just a symmetric matrix of dissimilarity, e.g. correlation, euclidean distance, absolute activation distance (L_1?)
  • RSA = representational similarity analysis
    • Comparison of the upper triangle of two RDMs, using the same metrics.
    • Or, alternately, second-order isomorphism.
  • So.. high level:
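A minimal numpy sketch of the two constructs (my own; Spearman rank correlation is the typical second-order comparison, plain Pearson used here for brevity):

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation between
    activity patterns (rows: conditions, cols: channels/voxels)."""
    return 1.0 - np.corrcoef(patterns)

def rsa(rdm_a, rdm_b):
    """Second-order comparison: correlate the upper triangles of two RDMs."""
    iu = np.triu_indices_from(rdm_a, k=1)
    return np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]

rng = np.random.default_rng(1)
P = rng.normal(size=(6, 50))   # 6 conditions, 50 measurement channels
D = rdm(P)                     # 6x6 symmetric, zero diagonal
```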

hide / / print
ref: -0 tags: variational free energy inference learning bayes curiosity insight Karl Friston date: 02-15-2019 02:09 gmt revision:1 [0] [head]

PMID-28777724 Active inference, curiosity and insight. Karl J. Friston, Marco Lin, Christopher D. Frith, Giovanni Pezzulo,

  • This has been my intuition for a while; you can learn abstract rules via active probing of the environment. This paper supports such intuitions with extensive scholarship.
  • “The basic theme of this article is that one can cast learning, inference, and decision making as processes that resolve uncertainty about the world.”
    • References Schmidhuber 1991
  • “A learner should choose a policy that also maximizes the learner’s predictive power. This makes the world both interesting and exploitable.” (Still and Precup 2012)
  • “Our approach rests on the free energy principle, which asserts that any sentient creature must minimize the entropy of its sensory exchanges with the world.” Ok, that might be generalizing things too far..
  • Levels of uncertainty:
    • Perceptual inference, the causes of sensory outcomes under a particular policy
    • Uncertainty about policies or about future states of the world, outcomes, and the probabilistic contingencies that bind them.
  • For the last element (probabilistic contingencies between the world and outcomes), they employ Bayesian model selection / Bayesian model reduction
    • Can occur not only on the data, but exclusively on the initial model itself.
    • “We use simulations of abstract rule learning to show that context-sensitive contingencies, which are manifest in a high-dimensional space of latent or hidden states, can be learned with straightforward variational principles (i.e. minimization of free energy).”
  • Assume that initial states and state transitions are known.
  • Perception or inference about hidden states (i.e. state estimation) corresponds to inverting a generative model given a sequence of outcomes, while learning involves updating the parameters of the model.
  • The actual task is quite simple: central fixation leads to a color cue. The cue + peripheral color determines which way to saccade.
  • Gestalt: good intuitions, but I’m left with the impression that the authors overexplain and / or make the description more complicated than it need be.
    • The actual number of parameters to be inferred is rather small -- 3 states in 4 (?) dimensions, and these parameters are not hard to learn by minimizing the variational free energy:
    • F = D[Q(x)||P(x)] - E_q[ln P(o_t|x)] where D is the Kullback-Leibler divergence.
      • Mean field approximation: Q(x) is fully factored (not here). Many more notes.
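The free energy expression above, computed for a toy discrete case (my own numbers, not Friston's task model) -- F is minimized, and equals the negative log evidence, exactly when Q matches the true posterior:

```python
import numpy as np

def free_energy(q, prior, lik):
    """Discrete variational free energy for a single observation o:
    F = KL[Q(x)||P(x)] - E_q[ln P(o|x)], over hidden states x."""
    kl = float(np.sum(q * np.log(q / prior)))        # divergence from prior
    return kl - float(np.sum(q * np.log(lik)))       # minus expected log-lik

prior = np.array([0.5, 0.5])                  # P(x) over 2 hidden states
lik = np.array([0.9, 0.1])                    # P(o|x) for the observed outcome
post = prior * lik / np.sum(prior * lik)      # exact posterior: [0.9, 0.1]
```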

hide / / print
ref: -0 tags: feedback alignment Arild Nokland MNIST CIFAR date: 02-14-2019 02:15 gmt revision:0 [head]

Direct Feedback alignment provides learning in deep neural nets

  • from {1423}
  • Feedback alignment is able to provide zero training error even in convolutional networks and very deep networks, completely without error back-propagation.
  • Biologically plausible: error signal is entirely local, no symmetric or reciprocal weights required.
    • Still, it requires supervision.
  • Almost as good as backprop!
  • Clearly written, easy to follow math.
    • Though the proof that feedback-alignment direction is within 90 deg of backprop is a bit impenetrable, needs some reorganization or additional exposition / annotation.
  • 3x400 tanh network tested on MNIST; performs similarly to backprop, if faster.
  • Also able to train very deep networks, on MNIST - CIFAR-10, CIFAR-100, 100 layers (which actually hurts this task).

hide / / print
ref: -2014 tags: Lillicrap Random feedback alignment weights synaptic learning backprop MNIST date: 02-14-2019 01:02 gmt revision:5 [4] [3] [2] [1] [0] [head]

PMID-27824044 Random synaptic feedback weights support error backpropagation for deep learning.

  • "Here we present a surprisingly simple algorithm for deep learning, which assigns blame by multiplying error signals by a random synaptic weights.
  • Backprop multiplies error signals e by the weight matrix W T W^T , the transpose of the forward synaptic weights.
  • But the feedback weights do not need to be exactly W T W^T ; any matrix B will suffice, so long as on average:
  • e TWBe>0 e^T W B e &gt; 0
    • Meaning that the teaching signal Be B e lies within 90deg of the signal used by backprop, W Te W^T e
  • Feedback alignment actually seems to work better than backprop in some cases. This relies on starting the weights very small (can't be zero -- no output)

"Our proof says that weights W0 and W evolve to equilibrium manifolds, but simulations (Fig. 4) and analytic results (Supplementary Proof 2) hint at something more specific: that when the weights begin near 0, feedback alignment encourages W to act like a local pseudoinverse of B around the error manifold. This fact is important because if B were exactly W+ (the Moore-Penrose pseudoinverse of W), then the network would be performing Gauss-Newton optimization (Supplementary Proof 3). We call this update rule for the hidden units pseudobackprop and denote it by ∆h_PBP = W+ e. Experiments with the linear network show that the angle ∆h_FA ∠ ∆h_PBP quickly becomes smaller than ∆h_FA ∠ ∆h_BP (Fig. 4b, c; see Methods). In other words feedback alignment, despite its simplicity, displays elements of second-order learning."
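A tiny numerical sketch of the idea (toy dimensions and learning rate of my choosing, not the paper's experiments): the error is sent back through a fixed random matrix B instead of W^T, starting from small weights, and the loss still falls.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, N = 8, 16, 4, 200
X = rng.normal(size=(N, n_in))
T = X @ (rng.normal(size=(n_in, n_out)) / np.sqrt(n_in))  # linear teacher

W0 = rng.normal(scale=0.01, size=(n_in, n_hid))   # start weights very small
W  = rng.normal(scale=0.01, size=(n_hid, n_out))
B  = rng.normal(size=(n_out, n_hid))              # fixed random feedback matrix

losses = []
for _ in range(500):
    h = X @ W0                                    # linear hidden layer
    e = h @ W - T                                 # output error
    losses.append(float(np.mean(e ** 2)))
    W  -= 0.01 * h.T @ e / N                      # ordinary delta rule at output
    W0 -= 0.01 * X.T @ (e @ B) / N                # error routed via B, not W^T
# training drives the loss down as W comes to align with B (e^T W B e > 0)
```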

hide / / print
ref: -0 tags: betzig sparse and composite coherent lattices date: 02-14-2019 00:00 gmt revision:1 [0] [head]

Sparse and composite coherent lattices

  • Focused on the math:
    • Linear algebra to find the wavevectors from the Bravais primitive vectors;
    • Iterative maximization @ lattice points to find the electric field phase and amplitude
    • (Read paper for details)
  • High NA objective naturally converts plane wave to a spherical wave; this can be used to create spherically-constrained lattices at the focal point of objectives.

hide / / print
ref: -0 tags: diffraction terahertz 3d print ucla deep learning optical neural networks date: 02-13-2019 23:16 gmt revision:1 [0] [head]

All-optical machine learning using diffractive deep neural networks

  • Pretty clever: use 3D printed plastic as diffractive media in a 0.4 THz all-optical all-interference (some attenuation) linear convolutional multi-layer 'neural network'.
  • In the arXiv publication there are few details on how they calculated or optimized the diffractive layers.
  • Absence of nonlinearity will limit things greatly.
  • Actual observed performance (where they had to print out the handwritten digits) was rather poor, ~ 60%.

hide / / print
ref: -2017 tags: calcium imaging seeded iterative demixing light field microscopy mouse cortex hippocampus date: 02-13-2019 22:44 gmt revision:1 [0] [head]

PMID-28650477 Video rate volumetric Ca2+ imaging across cortex using seeded iterative demixing (SID) microscopy

  • Tobias Nöbauer, Oliver Skocek, Alejandro J Pernía-Andrade, Lukas Weilguny, Francisca Martínez Traub, Maxim I Molodtsov & Alipasha Vaziri
  • Cell-scale imaging at video rates of hundreds of GCaMP6 labeled neurons with light-field imaging followed by computationally-efficient deconvolution and iterative demixing based on non-negative factorization in space and time.
  • Utilized a hybrid light-field and 2p microscope, but didn't use the latter to inform the SID algorithm.
  • Algorithm:
    • Remove motion artifacts
    • Time iteration:
      • Compute the standard deviation versus time (subtract the mean over time, measure the standard deviation)
      • Deconvolve the standard-deviation image using the Richardson-Lucy algorithm, with non-negativity and sparsity constraints and a simulated PSF.
      • Yields hotspots of activity, putative neurons.
      • These neuron locations are convolved with the PSF, thereby estimating each one's ballistic image on the LFM.
      • This is converted to a binary mask of pixels which contribute information to the activity of a given neuron, a 'footprint'.
        • Form a matrix of these footprints, p × n, S_0 (p pixels, n neurons)
      • Also get the corresponding image data Y, p × t (t time)
      • Solve: minimize over T ||Y - S_0 T||_2 subject to T ≥ 0
        • That is, find a non-negative matrix of temporal components T which predicts data Y from masks S_0.
    • Space iteration:
      • Start with the masks again, S; find all sets O^k of spatially overlapping components s_i (e.g. where footprints overlap)
      • Extract the corresponding data columns t_i of T (from the temporal step above) for O^k to yield T^k. Each column corresponds to temporal data for the spatial overlap sets. (additively?)
      • Also get the data matrix Y^k, the image data in the overlapping regions, in the same way.
      • Minimize over S^k: ||Y^k - S^k T^k||_2
      • Subject to S^k ≥ 0
        • That is, solve over the footprints S^k to best predict the data from the corresponding temporal components T^k.
        • They also impose spatial constraints on this non-negative least squares problem (not explained).
    • This process repeats.
    • Allegedly 1000x better than existing deconvolution / blind source separation algorithms, such as those used in CaImAn.
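The non-negative least-squares subproblem at the heart of both iterations (minimize ||Y - S T||_2 subject to T ≥ 0, and the transposed problem over S^k) can be sketched with projected gradient descent. This is a minimal stand-in solver, not the paper's (unspecified) method, and `nnls_pg` is a hypothetical name:

```python
import numpy as np

def nnls_pg(S, Y, n_iter=2000):
    """Solve min_T ||Y - S T||_F^2 subject to T >= 0 by projected gradient.
    S: (pixels x neurons) footprint masks; Y: (pixels x time) image data."""
    T = np.zeros((S.shape[1], Y.shape[1]))
    L = np.linalg.norm(S.T @ S, 2)          # Lipschitz constant of the gradient
    for _ in range(n_iter):
        G = S.T @ (S @ T - Y)               # gradient of the squared error
        T = np.maximum(0.0, T - G / L)      # gradient step + non-negativity
    return T
```

The spatial iteration solves the transposed problem over S^k with the same machinery; the paper's additional spatial constraints are not reproduced here.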

hide / / print
ref: -0 tags: Hinton google tech talk dropout deep neural networks Boltzmann date: 02-12-2019 08:03 gmt revision:2 [1] [0] [head]

Brains, sex, and machine learning -- Hinton google tech talk.

  • Hinton believes in the power of crowds -- he thinks that the brain fits many, many different models to the data, then selects among them afterward.
    • Random forests, as used in Predator, are an example of this: they average many simple-to-fit, simple-to-run decision trees. (This is apparently what Kinect does.)
  • Talk focuses on dropout, a clever new form of model averaging where only half of the units in the hidden layers are trained for a given example.
    • He is inspired by biological evolution, where sexual reproduction often spontaneously adds or removes genes, hence individual genes or small linked genes must be self-sufficient. This equates to a 'rugged individualism' of units.
    • Likewise, dropout forces neurons to be robust to the loss of co-workers.
    • This is also great for parallelization: each unit or sub-network can be trained independently, on its own core, with little need for communication! Later, the units can be combined via genetic algorithms then re-trained.
  • Hinton then observes that sending a real value p (output of logistic function) with probability 0.5 is the same as sending 0.5 with probability p. Hence, it makes sense to try pure binary neurons, like biological neurons in the brain.
    • Indeed, if you replace the backpropagation with single bit propagation, the resulting neural network is trained more slowly and needs to be bigger, but it generalizes better.
      • Neurons (allegedly) do something very similar to this via Poisson spiking. Hinton claims this is the right thing to do (rather than sending real numbers via precise spike timing) if you want to robustly fit models to data.
      • Sending stochastic spikes is a very good way to average over the large number of models fit to incoming data.
      • Yes but this really explains little in neuroscience...
  • Paper referred to in intro: Livnat, Papadimitriou and Feldman, PMID-19073912 and later by the same authors PMID-20080594
    • A mixability theory for the role of sex in evolution. -- "We define a measure that represents the ability of alleles to perform well across different combinations and, using numerical iterations within a classical population-genetic framework, show that selection in the presence of sex favors this ability in a highly robust manner"
    • Plus David MacKay's concise illustration of why you need sex, pg 269, __Information theory, inference, and learning algorithms__
      • With rather simple assumptions, asexual reproduction yields 1 bit per generation,
      • Whereas sexual reproduction yields √G bits, where G is the genome size.
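The dropout idea above is easy to make concrete. This sketch uses the common 'inverted' dropout variant (rescale at train time so test time needs no change), which differs slightly from the talk's original formulation (scale weights by p at test time); both have the same expected activations:

```python
import numpy as np

rng = np.random.default_rng(1)

def dropout_forward(h, p_keep=0.5, train=True):
    """'Inverted' dropout: at train time keep each hidden unit with
    probability p_keep and rescale by 1/p_keep, so the test-time pass
    can use all units unchanged -- the expected activation matches."""
    if train:
        mask = rng.random(h.shape) < p_keep
        return h * mask / p_keep
    return h

h = np.ones(10000)
print(dropout_forward(h).mean())   # close to 1.0: averaging over random masks
                                   # approximates the full model ensemble
```

Each random mask is effectively a different thinned network, so training with dropout averages over an exponential number of shared-weight models, which is the model-averaging point Hinton makes.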

hide / / print
ref: -2019 tags: mosers hippocampus popsci nautilus grid cells date: 02-12-2019 07:32 gmt revision:1 [0] [head]

New Evidence for the Strange Geometry of Thought

  • Wow. Things are organized in 2d structures in the brain. The surprising thing about this article is that only the hippocampus is mentioned, with no discussion of the cortex. Well, it was written by a second-year graduate student (though, admittedly, the writing style is perfectly fine.)

hide / / print
ref: -0 tags: superresolution imaging scanning lens nanoscale date: 02-04-2019 20:34 gmt revision:1 [0] [head]

PMID-27934860 Scanning superlens microscopy for non-invasive large field-of-view visible light nanoscale imaging

  • Recently, the diffraction barrier has been surpassed by simply introducing dielectrics with a micro-scale spherical configuration when using conventional optical microscopes, transforming evanescent waves into propagating waves (refs 18-30).
  • The resolution of this superlens-based microscopy has been decreased to ~50 nm (ref. 26) from an initial resolution of ~200 nm (ref. 21).
  • This method can be further enhanced to ~25 nm when coupled with a scanning laser confocal microscope (ref. 31).
  • It has achieved fast development in biological applications, as the sub-diffraction-limited resolution of high-index liquid-immersed microspheres has now been demonstrated (refs 23, 32), enabling its application in the aqueous environment required to maintain biological activity.
  • Microlens is a 57 um diameter BaTiO3 microsphere; resolution of lambda / 6.3 under partial and inclined illumination.
  • Microsphere is in contact with the surface during imaging, by gluing it to the cantilever tip of an AFM.
  • Get an image with the microsphere-lens, which improves imaging performance by ~ 200x (with a loss in quality, naturally).

hide / / print
ref: -0 tags: Kato fear conditioning GABA auditory cortex mice optogenetics SOM PV date: 02-04-2019 19:09 gmt revision:0 [head]

PMID-29375323 Fear learning regulates cortical sensory representation by suppressing habituation

  • Trained mice on a CS+ / CS- --> lick task.
    • CS+ = auditory tone followed by tailshock
    • CS- = auditory tone (both FM modulated, separated by 0.5 - 1.0 octave).
    • US = licking.
  • VGAT2-ChR2 or PV-ChR2
  • GABA-ergic silencing of auditory cortex through blue light illumination abolished the behavioral difference following CS+ and CS-.
  • Used intrinsic imaging to locate A1 cortex, then AAV-GCaMP6 imaging to locate pyramidal cells.
  • In contrast to reports of enhanced tone responses following simple fear conditioning (Quirk et al., 1997; Weinberger, 2004, 2015), discriminative learning under our conditions caused no change in the average fraction of pyramidal cells responsive to the CS+ tone.
    • Seemed to be an increase in suppression, and reduced cortical responses, which is consistent with habituation.
  • Whereas -- and this is by no means surprising -- cortical responses to CS+ were sustained at end of tone following fear conditioning.
  • ----
  • Then examined this effect relative to the two populations of interneurons, using PV-cre and SOM-cre mice.
    • In PV cells, fear conditioning resulted in a decreased fraction of cells responsive, and a decreased magnitude of responses.
    • In SOM cells, CS- responses were enhanced, while CS+ responses were less enhanced (the main text seems like an exaggeration; cf. figure 6E)
  • This is possibly the more interesting result of the paper, but even then the result is not super strong.

hide / / print
ref: -0 tags: curiosity exploration forward inverse models trevor darrell date: 02-01-2019 03:42 gmt revision:1 [0] [head]

Curiosity-driven exploration by Self-supervised prediction

  • Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell
  • Key insight: “we only predict the changes in the environment that could possibly be due to actions of our agent or affect the agent, and ignore the rest”.
    • Instead of making predictions in the sensory space (e.g. pixels), we transform the sensory input into a feature space where only the information relevant to the agent is represented.
    • We learn this feature space using self-supervision -- training a neural network via a proxy inverse dynamics task -- predicting the agent’s action from the past and future sensory states.
  • We then use this inverse model to train a forward dynamics model to predict feature representation of the next state from present feature representation and action.
      • The difference between expected and actual representation serves as a reward signal for the agent.
  • Quasi actor-critic / adversarial agent design, again.
  • Used the asynchronous advantage actor critic policy gradient method (Mnih et al 2016 Asynchronous Methods for Deep Reinforcement Learning).
  • Compare with variational information maximization (VIME) trained with TRPO (Trust region policy optimization) which is “more sample efficient than A3C but takes more wall time”.
  • References / concurrent work: Several methods propose improving data efficiency of RL algorithms using self-supervised prediction based auxiliary tasks (Jaderberg et al., 2017; Shelhamer et al., 2017).
  • An interesting direction for future research is to use the learned exploration behavior / skill as a motor primitive / low level policy in a more complex, hierarchical system. For example, the skill of walking along corridors could be used as part of a navigation system.
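The intrinsic-reward computation described above (forward-model prediction error in a learned feature space) reduces to a few lines. This sketch uses linear maps as stand-ins for the paper's conv-net feature encoder and forward-model network; `phi`, `F`, and all dimensions are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

# hypothetical sizes; the paper uses learned networks, linear maps here
obs_dim, feat_dim, n_actions = 64, 8, 4
phi = 0.1 * rng.standard_normal((feat_dim, obs_dim))   # stand-in feature encoder
                                                       # (trained via the inverse-
                                                       # dynamics task in the paper)
F = 0.1 * rng.standard_normal((feat_dim, feat_dim + n_actions))  # forward model

def intrinsic_reward(s, a_onehot, s_next):
    """Curiosity reward: forward-model prediction error in feature space,
    so surprises in agent-irrelevant pixels (filtered out by phi) don't count."""
    f, f_next = phi @ s, phi @ s_next
    pred = F @ np.concatenate([f, a_onehot])   # predicted next-state features
    return 0.5 * np.sum((pred - f_next) ** 2)  # never negative
```

In the full system this reward is added to (or replaces) the extrinsic reward fed to the A3C policy-gradient learner, while `phi` is trained on the inverse-dynamics proxy task and `F` on the forward-prediction loss.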

hide / / print
ref: -0 tags: lillicrap segregated dendrites deep learning backprop date: 01-31-2019 19:24 gmt revision:2 [1] [0] [head]

PMID-29205151 Towards deep learning with segregated dendrites https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5716677/

  • Much emphasis on the problem of credit assignment in biological neural networks.
    • That is: given complex behavior, how do upstream neurons change to improve the task of downstream neurons?
    • Or: given downstream neurons, how do upstream neurons receive ‘credit’ for informing behavior?
      • I find this a very limiting framework, and is one of my chief beefs with the work.
      • Spatiotemporal Bayesian structure seems like a much better axis (axes) to cast function against.
      • Or, it could be segregation into ‘signal’ and ‘error’ or ‘figure/ground’ based on hierarchical spatio-temporal statistical properties that matters ...
      • ... with proper integration of non-stochastic spike timing + neoSTDP.
        • This still requires some solution of the credit-assignment problem, i know i know.
  • Outline a spiking neuron model with zero one or two hidden layers, and a segregated apical (feedback) and basal (feedforward) dendrites, as per a layer 5 pyramidal neuron.
  • The apical dendrites have plateau potentials, which are stimulated through (random) feedback weights from the output neurons.
  • Output neurons are forced to one-hot activation at maximum firing rate during training.
    • In order to assign credit, feedforward information must be integrated separately from any feedback signals used to calculate error for synaptic updates (the error is indicated here with δ). (B) Illustration of the segregated dendrites proposal. Rather than using a separate pathway to calculate error based on feedback, segregated dendritic compartments could receive feedback and calculate the error signals locally.
  • Uses the MNIST database, naturally.
  • Poisson spiking input neurons, 784, again natch.
  • Derive local loss function learning rules to make the plateau potential (from the feedback weights) match the feedforward potential
    • This encourages the hidden layer -> output layer to approximate the inverse of the random feedback weight network -- which it does! (At least, the jacobians are inverses of each other).
    • The matching is performed in two phases -- feedforward and feedback. This itself is not biologically implausible, just unlikely.
  • Achieved moderate performance on MNIST, ~ 4%, which improved with 2 hidden layers.
  • Very good, interesting scholarship on the relevant latest findings ‘’in vivo’’.
  • While the model seems workable though ad-hoc or just-so, the scholarship points to something better: use of multiple neuron subtypes to accomplish different elements (variables) in the random-feedback credit assignment algorithm.
    • These small models can be tuned to do this somewhat simple task through enough fiddling & manual (e.g. in the algorithmic space, not weight space) backpropagation of errors.
  • They suggest that the early phases of learning may entail learning the feedback weights -- fascinating.
  • ‘’Things are definitely moving forward’’.

hide / / print
ref: -0 tags: STDP dopamine hippocampus date: 01-16-2019 21:56 gmt revision:1 [0] [head]

PMID-26516682 Retroactive modulation of spike timing-dependent plasticity by dopamine.

  • Here we show that dopamine, a positive reinforcement signal, can retroactively convert hippocampal timing-dependent synaptic depression into potentiation.
  • This effect requires functional NMDA receptors and is mediated in part through the activation of the cAMP/PKA cascade.
  • Mouse horizontal slices.
  • Plasticity induced by 100 pairings of a single EPSP followed by a postsynaptic spike (heavy-handed?)
  • Pre-before-post @ 10ms -> LTP
  • Post-before-pre @ -20ms -> LTD
  • Post-before-pre @ -10ms -> LTP (?!)
    • Addition of Dopamine antagonist (D2: sulpiride, D1/D5: SCH23390) prevented LTP and resulted in LTD.
  • Post-before-pre @ -20ms -> LTP in the presence of 20 uM DA.
    • The presence of DA during coordinated spiking activity widens the timing interval for induction of LTP.
  • What about if it's applied afterward?
  • 20 uM DA applied 1 minute after LTD induction @ -20 ms (for 10-12 minutes) converted LTD into LTP.
    • This was blocked by addition of the DA antagonists.
    • Did not work if DA was applied 10 or 30 minutes after the LTD induction.
  • Others have shown that this requires functional NMDA receptors.
    • Application of the NMDA receptor antagonist D-AP5 after post-before-pre -20ms did not affect LTD.
    • Application of D-AP5 before DA partially blocked conversion of LTD to LTP.
    • Application of D-AP5 alone before induction did not affect LTD.
  • This is dependent on the cAMP/PKA signaling cascade:
    • Application of forskolin (andenylyl cyclase AC activator) converts LTD -> LTP.
    • Dependent on NMDA.
  • PKA inhibitor H-89 also blocked the LTD -> LTP conversion.

hide / / print
ref: -0 tags: US employment top 100 bar chart date: 11-12-2018 00:02 gmt revision:1 [0] [head]

After briefly searching the web, I could not find a chart of the top 100 occupations in the US. After downloading the data from the US Bureau of Labor Statistics, I made this chart:

Click for full-size.

Surprising how very service heavy our economy is.

hide / / print
ref: -0 tags: cutting plane manifold learning classification date: 10-31-2018 23:49 gmt revision:0 [head]

Learning data manifolds with a Cutting Plane method

  • Looks approximately like SVM: perform binary classification on a high-dimensional manifold (or sets of manifolds in this case).
  • The general idea behind Mcp_simple is to start with a finite number of training examples, find the maximum margin solution for that training set, augment the training set by finding a point on the manifolds that violates the constraints, and iterate the process until a tolerance criterion is met.
  • The more complicated cutting plane SVM uses slack variables to allow solutions where classification is not linearly separable.
    • Propose using one slack variable per manifold, plus a manifold center, which strictly obeys the margin (classification) constraint.
  • Much effort put to proving the convergence properties of these algorithms; admittedly I couldn't be bothered to read...

hide / / print
ref: -0 tags: hahnloser zebrafinch LMAN HVC song learning internal model date: 10-12-2018 00:33 gmt revision:1 [0] [head]

PMID-24711417 Evidence for a causal inverse model in an avian cortico-basal ganglia circuit

  • Recorded and stimulated the LMAN (upstream, modulatory) region of the zebrafinch song-production & learning pathway.
  • Found evidence, albeit weak, for a mirror arrangement or 'causal inverse' there: neurons fire bursts prior to syllable production with some motor delay, ~30 ms, and also fire single spikes with a delay of ~10 ms to the same syllables.
    • This leads to an overall 'mirroring offset' of about 40 ms, which is sufficiently supported by the data.
    • The mirroring offset is quantified by looking at the cross-covariance of audio-synchronized motor and sensory firing rates.
  • Causal inverse: a sensory target input generates a motor activity pattern required to cause, or generate that same sensory target.
    • Similar to the idea of temporal inversion via memory.
  • Data is interesting, but not super strong; per the discussion, the authors were going for a much broader theory:
    • Normal Hebbian learning says that if a presynaptic neuron fires before a postsynaptic neuron, then the synapse is potentiated.
    • However, there is another side of the coin: if the presynaptic neuron fires after the postsynaptic neuron, the synapse can be similarly strengthened, permitting the learning of inverse models.
      • "This order allows sensory feedback arriving at motor neurons to be associated with past postsynaptic patterns of motor activity that could have caused this sensory feedback. " So: stimulate the sensory neuron (here hypothetically in LMAN) to get motor output; motor output is indexed in the sensory space.
      • In mammals, a similar rule has been found to describe synaptic connections from the cortex to the basal ganglia [37].
      • ... or, based on anatomy, a causal inverse could be connected to a dopaminergic VTA, thereby linking with reinforcement learning theories.
      • Simple reinforcement learning strategies can be enhanced with inverse models as a means to solve the structural credit assignment problem [49].
  • Need to review literature here, see how well these theories of cortical-> BG synapse match the data.

hide / / print
ref: -0 tags: deeplabcut markerless tracking DCN transfer learning date: 10-03-2018 23:56 gmt revision:0 [head]

Markerless tracking of user-defined features with deep learning

  • Human-level tracking with as few as 200 labeled frames.
  • No dynamics - could be even better with a Kalman filter.
  • Uses a Google-trained DCN, 50 or 101 layers deep.
    • Network has a distinct read-out layer per feature to localize the probability of a body part to a pixel location.
  • Uses the DeeperCut network architecture / algorithm for pose estimation.
  • These deep features were trained on ImageNet
  • Trained on examples with both only the readout layers (rest fixed per ResNet), as well as end-to-end; latter performs better, unsurprising.

hide / / print
ref: -0 tags: NMDA spike hebbian learning states pyramidal cell dendrites date: 10-03-2018 01:15 gmt revision:0 [head]

PMID-20544831 The decade of the dendritic NMDA spike.

  • NMDA spikes occur in the finer basal, oblique, and tuft dendrites.
  • Typically 40-50 mV, up to 100's of ms in duration.
  • Look similar to cortical up-down states.
  • Permit / form the substrate for spatially and temporally local computation on the dendrites that can enhance the representational or computational repertoire of individual neurons.

hide / / print
ref: -0 tags: kernel regression structure discovery fitting gaussian process date: 09-24-2018 22:09 gmt revision:1 [0] [head]

Structure discovery in Nonparametric Regression through Compositional Kernel Search

  • Use Gaussian process kernels (squared exponential, periodic, linear, and rational quadratic)
  • to model a kernel function k(x, x'), which specifies how similar or correlated outputs y and y' are expected to be at two points x and x'.
    • By defining the measure of similarity between inputs, the kernel determines the pattern of inductive generalization.
    • This is different than modeling the mapping y = f(x).
    • It's something more like y' = N(m(x') + k(x, x')) -- check the appendix.
    • See also: http://rsta.royalsocietypublishing.org/content/371/1984/20110550
  • Gaussian process models use a kernel to define the covariance between any two function values: Cov(y, y') = k(x, x').
  • This kernel family is closed under addition and multiplication, and provides an interpretable structure.
  • Search for kernel structure greedily & compositionally,
    • then optimize parameters with conjugate gradients with restarts.
    • This seems straightforwardly intuitive...
  • Kernels are scored with the BIC.
  • C.f. {842} -- "Because we learn expressions describing the covariance structure rather than the functions themselves, we are able to capture structure which does not have a simple parametric form."
  • All their figure examples are 1-D time-series, which is kinda boring, but makes sense for creating figures.
    • Tested on multidimensional (d=4) synthetic data too.
    • Not sure how they back out modeling the covariance into actual predictions -- just draw (integrate) from the distribution?
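The closure-under-addition property is easy to demonstrate: summing two base kernels yields another valid covariance, which plugs straight into the standard GP posterior-mean formula. A minimal numpy sketch on 1-D data (hyperparameters and the toy target are arbitrary choices, not the paper's learned structure or BIC search):

```python
import numpy as np

def k_se(x, y, ell=1.0, sf=1.0):
    """Squared-exponential kernel on 1-D inputs."""
    return sf**2 * np.exp(-0.5 * (x[:, None] - y[None, :])**2 / ell**2)

def k_lin(x, y, sb=0.1):
    """Linear kernel (plus a small bias term)."""
    return x[:, None] * y[None, :] + sb**2

def k_sum(x, y):
    # closed under addition: SE + linear is itself a valid covariance,
    # capturing a trend plus smooth local variation
    return k_se(x, y) + k_lin(x, y)

rng = np.random.default_rng(3)
X = np.linspace(0.0, 5.0, 30)                      # training inputs
y = 0.5 * X + np.sin(2.0 * X) + 0.05 * rng.standard_normal(30)
Xs = np.linspace(0.0, 5.0, 100)                    # test inputs

K = k_sum(X, X) + 1e-2 * np.eye(30)                # add observation-noise variance
mu = k_sum(Xs, X) @ np.linalg.solve(K, y)          # GP posterior mean
```

Predictions come from exactly this kind of draw from (or mean of) the posterior: the learned covariance expression determines the predictive distribution, which answers the "back out predictions" question above in the standard GP-regression way.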

hide / / print
ref: work-0 tags: distilling free-form natural laws from experimental data Schmidt Cornell automatic programming genetic algorithms date: 09-14-2018 01:34 gmt revision:5 [4] [3] [2] [1] [0] [head]

Distilling free-form natural laws from experimental data

  • Their critical step was to use partial derivatives to evaluate the search for invariants. Even so, with a 4D data set the search for natural laws took ~ 30 hours.
    • Then again, how long did it take humans to figure out these invariants? (They went about it in a decidedly different way..)
    • Further, how long did it take for biology to discover similar invariants?
      • They claim elsewhere that the same algorithm has been applied to biological data - a metabolic pathway - with some success.
      • Of course evolution had to explore a much larger space - proteins and regulatory pathways, not simpler mathematical expressions / linkages.

hide / / print
ref: -0 tags: coevolution fitness prediction schmidt genetic algorithm date: 09-14-2018 01:34 gmt revision:8 [7] [6] [5] [4] [3] [2] [head]

Coevolution of Fitness Predictors

  • Michael D. Schmidt and Hod Lipson, Member, IEEE
  • Fitness prediction is a technique to replace fitness evaluation in evolutionary algorithms with a light-weight approximation that adapts with the solution population.
    • Cannot approximate the full landscape, but shift focus during evolution.
    • Aka local caching.
    • Or adversarial techniques.
  • Instead use coevolution, with three populations:
    • 1) solutions to the original problem, evaluated using only fitness predictors;
    • 2) fitness predictors of the problem; and
    • 3) fitness trainers, whose exact fitness is used to train predictors.
      • Trainers are selected as high-variance solutions across the predictors, and predictors are trained on this subset.
  • Lightweight fitness predictors evolve faster than the solution population, so they cap the computational effort on that at 5% overall effort.
    • These fitness predictors are basically an array of integers which index the full training set -- very simple and linear. Maybe boring, but the simplest solution that works ...
    • They only sample 8 training examples for even complex 30-node solution functions (!!).
    • I guess, because the information introduced into the solution set is relatively small per generation, it makes little sense to over-sample or over-specify this; all that matters is that, on average, it's directionally correct and unbiased.
  • Used deterministic crowding selection as the evolutionary algorithm.
    • Similar individuals have to compete in tournaments for space.
  • Showed that the coevolution algorithm is capable of inferring even highly complex many-term functions
    • And, it uses function evaluations more efficiently than the 'exact' (each solution evaluated exactly) algorithm.
  • Coevolution algorithm seems to induce less 'bloat' in the complexity of the solutions.
  • See also {842}

hide / / print
ref: -2018 tags: machine learning manifold deep neural net geometry regularization date: 08-29-2018 14:30 gmt revision:0 [head]

LDMNet: Low dimensional manifold regularized neural nets.

  • Synopsis of the math:
    • Fit a manifold formed from the concatenated input ‘’and’’ output variables, and use this to set the loss of (hence, train) a deep convolutional neural network.
      • Manifold is fit via point integral method.
      • This requires both SGD and variational steps -- alternate between fitting the parameters, and fitting the manifold.
      • Uses a standard deep neural network.
    • Measure the dimensionality of this manifold to regularize the network, using an 'elegant trick', whatever that means.
  • Still, the results, in terms of error, seem not very significantly better than previous work (compared to weight decay, which is weak sauce, and dropout)
    • That said, the results in terms of feature projection, figures 1 and 2, ‘’do’’ look clearly better.
    • Of course, they apply the regularizer to same image recognition / classification problems (MNIST), and this might well be better adapted to something else.
  • Not completely thorough analysis, perhaps due to space and deadlines.

hide / / print
ref: -0 tags: tissue probe neural insertion force damage wound speed date: 06-02-2018 00:03 gmt revision:0 [head]

PMID-21896383 Effect of Insertion Speed on Tissue Response and Insertion Mechanics of a Chronically Implanted Silicon-Based Neural Probe

  • Two speeds, 10um/sec and 100um/sec, monitored out to 6 weeks.
  • Once the probes were fully advanced into the brain, we observed a decline in the compression force over time.
    • However, the compression force never decreased to zero.
    • This may indicate that chronically implanted probes experience a constant compression force when inserted in the brain, which may push the probe out of the brain over time if there is nothing to keep it in a fixed position.
      • Yet ... the Utah probe seems fine, up to many months in humans.
    • This may be a drawback for flexible probes [24], [25]. The approach to reduce tissue damage by reducing micromotion by not tethering the probe to the skull can also have this disadvantage [26]. Furthermore, the upward movement may lead to the inability of the contacts to record signals from the same neurons over long periods of time.
  • We did not observe a difference in initial insertion force, amount of dimpling, or the rest force after a 3-min rest period, but the force at the end of the insertion was significantly higher when inserting at 100 μm/s compared to 10 μm/s.
  • No significant difference in histological response observed between the two speeds.

hide / / print
ref: -0 tags: insertion speed needle neural electrodes force damage injury cassanova date: 06-01-2018 23:51 gmt revision:0 [head]

Effect of Needle Insertion Speed on Tissue Injury, Stress, and Backflow Distribution for Convection-Enhanced Delivery in the Rat Brain

  • Tissue damage, evaluated as the size of the hole left by the needle after retraction, bleeding, and tissue fracturing, was found to increase for increasing insertion speeds and was higher within white matter regions.
    • A statistically significant difference in hole areas with respect to insertion speed was found.
  • While there are no previous needle insertion speed studies with which to directly compare, previous electrode insertion studies have noted greater brain surface dimpling and insertion forces with increasing insertion speed [43–45]. These higher deformation and force measures may indicate greater brain tissue damage which is in agreement with the present study.
  • There are also studies which have found that fast insertion of sharp tip electrodes produced less blood vessel rupture and bleeding [28,29].
    • These differences in rate dependent damage may be due to differences in tip geometry (diameter and tip) or tissue region, since these electrode studies focus mainly on the cortex [28,29].
    • In the present study, hole measurements were small in the cortex, and no substantial bleeding was observed in the cortex except when it was produced during dura mater removal.
    • Any hemorrhage was observed primarily in white matter regions of the external capsule and the CPu.

hide / / print
ref: -0 tags: insertion speed neural electrodes force damage date: 06-01-2018 23:38 gmt revision:2 [1] [0] [head]

In vivo evaluation of needle force and friction stress during insertion at varying insertion speed into the brain

  • Targeted at CED procedures, but probably applicable elsewhere.
  • Used a blunted 32ga CA glue filled hypodermic needle.
  • Sprague-dawley rats.
  • Increased insertion speed corresponds with increased force, unlike cardiac tissue.
  • Greater surface dimpling before failure results in larger regions of deformed tissue and more energy storage before needle penetration.
  • In this study (blunt needle) dimpling increased with insertion speed, indicating that more energy was transferred over a larger region and increasing the potential for injury.
  • However, friction stresses likely decrease with insertion speed since larger tissue holes were measured with increasing insertion speeds indicating lower frictional stresses.
    • Rapid deformation results in greater pressurization of fluid filled spaces if fluid does not have time to redistribute, making the tissue effectively stiffer. This may occur in compacted tissues below or surrounding the needle and result in increasing needle forces with increasing needle speed.

hide / / print
ref: -2015 tags: ice charles lieber silicon nanowire probes su-8 microwire extracellular date: 05-30-2018 23:40 gmt revision:3 [2] [1] [0] [head]

PMID-26436341 Three-dimensional macroporous nanoelectronic networks as minimally invasive brain probes.

  • Xie C1, Liu J1, Fu TM1, Dai X1, Zhou W1, Lieber CM1,2.
  • Again, use silicon nanowire transistors as sensing elements. These seem rather good; can increase the signal, and do not suffer from shunt resistance / capacitance like wires.
    • They're getting a lot of mileage out of the technology; initial pub back in 2006.
  • Su-8, Cr/Pd/Cr (stress elements) and Cr/Au/Cr (conductor) spontaneously rolled into a ball, which they then froze in LN2. Devices seemed robust to freezing in LN2.
  • 300-500nm Su-8 passivation layers, as with the syringe injectable electrodes.
  • 3um trace / 7um insulation (better than us!)
  • Used 100nm Ni release layer; thin / stiff enough Su-8 with rigid Si support chip permitted wirebonding a connector (!!)
    • Might want to use this as well for our electrodes -- of course, then we'd have to use the dicing saw, and free-etch away a Ni (or Al?) polyimide adhesion layer -- or use Su-8 like them. See figure S-4
  • See also {1352}

hide / / print
ref: -0 tags: tissue response indwelling implants dialysis kozai date: 04-04-2018 00:28 gmt revision:1 [0] [head]

PMID-25546652 Brain Tissue Responses to Neural Implants Impact Signal Sensitivity and Intervention Strategies

  • (Interesting): eight identical electrode arrays implanted into the same region of different animals have shown that half the arrays continue to record neural signals for >14 weeks while in the other half of the arrays, single-unit yield rapidly degraded and ultimately failed over the same timescale.
  • In another study, aimed at uncovering the time course of insertion-related bleeding and coagulation, electrodes were implanted into the cortex of rats at varying time intervals (−120, −90, −60, −30, −15, and 0 min) using a micromanipulator and linear motor with an insertion speed of 2 mm/s.40 The results showed dramatic variability in BBB leakage that washed out any trend (Figure 3), suggesting that a separate underlying cause was responsible for the large inter- and intra-animal variability.

hide / / print
ref: -0 tags: recurrent cortical model adaptation gain V1 LTD date: 03-27-2018 17:48 gmt revision:1 [0] [head]

PMID-18336081 Adaptive integration in the visual cortex by depressing recurrent cortical circuits.

  • Mainly focused on the experimental observation that decreasing contrast increases latency to both behavioral and neural response (latter in the later visual areas..)
  • Idea is that synaptic depression in recurrent cortical connections mediates this 'adaptive integration' time-constant to maintain reliability.
  • Model also explains persistent activity after a flashed stimulus.
  • No plasticity or learning, though.
  • Rather elegant and well explained.

hide / / print
ref: -2016 tags: somatostatin interneurons review date: 02-11-2018 18:08 gmt revision:0 [head]

PMID-27225074 Somatostatin-expressing neurons in cortical networks.

  • Urban-Ciecko J1, Barth AL1.
  • High (~ 10hz) tonic (constitutive) firing rate. All GABA.
  • Somatostatin, a neuropeptide, is of ill-defined role. Unknown when it is released.
  • SST interneurons receive diffuse input from cortical pyramidal cells, but each synapse is of low strength.
  • SST interneurons are frequently electrically coupled through gap junctions, but almost never connected through chemical synapses. The resulting network can extend for hundreds of microns, and has been shown to cause synchronized firing when cells are active.
  • Common anesthetics (isoflurane, urethane) profoundly silence the SSTs.
  • Wide diversity of axonal and dendritic branching patterns, targeting both apical (20%) and distal pyramidal cell dendrites.
  • SST neuron activity is reduced in Dravet syndrome.
  • SST neurons have also been implicated in schizophrenia; affected individuals show decreased SST mRNA and mislocalization of SST interneurons.

hide / / print
ref: -0 tags: NET probes SU-8 microfabrication sewing machine carbon fiber electrode insertion mice histology 2p date: 12-29-2017 04:38 gmt revision:1 [0] [head]

PMID-28246640 Ultraflexible nanoelectronic probes form reliable, glial scar–free neural integration

  • SU-8 asymptotic H2O absorption is 3.3% in PBS -- quite a bit higher than I expected, and higher than PI.
  • Faced yield problems with contact litho at 2-3um trace/space.
  • Good recordings out to 4 months!
  • 3 minutes / probe insertion.
  • Fab:
    • Ni release layer, Su-8 2000.5. "excellent tensile strength" --
      • Tensile strength 60 MPa
      • Youngs modulus 2.0 GPa
      • Elongation at break 6.5%
      • Water absorption, per spec sheet, 0.65% (but not PBS)
    • 500nm dielectric; < 1% crosstalk; see figure S12.
    • Pt or Au rec sites, 10um x 20um or 30 x 30um.
    • FFC connector, with Si substrate remaining.
  • Used transgenic mice, YFP expressed in neurons.
  • CA glue used before metabond, followed by Kwik-sil silicone.
  • Neuron yield not so great -- they need to plate the electrodes down to acceptable impedance. (figure S5)
    • Measured impedance ~ 1M at 1khz.
  • Unclear if 50um x 1um is really that much worse than 10um x 1.5um.
  • Histology looks really great (figure S10).
  • Manuscript did not mention (though they did at the poster) problems with electrode pull-out; they deal with it in the same way, application of ACSF.

hide / / print
ref: Kim-2008.01 tags: PEDOT review soft date: 12-29-2017 04:34 gmt revision:4 [3] [2] [1] [0] [head]

PMID-21204405 Soft, Fuzzy, and Bioactive Conducting Polymers for Improving the Chronic Performance of Neural Prosthetic Devices.

  • lays out the soft electrode approach (obviously).
  • Extensive discussion of conductive polymer plating methods for neural electrodes.


[0] Kim DH, Richardson-Burns S, Povlich L, Abidian MR, Spanninga S, Hendricks JL, Martin DC, Soft, Fuzzy, and Bioactive Conducting Polymers for Improving the Chronic Performance of Neural Prosthetic Devicesno Source no Volume no Issue no Pages (2008)

hide / / print
ref: Salcman-1973.07 tags: Salcman MEA microelectrodes chronic recording glass cyanoacrylate date: 12-29-2017 04:33 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

PMID-4708761 Design, Fabrication, and In Vivo Behavior of Chronic Recording Intracortical Microelectrodes

  • Teflon-coated 25um Pt-Ir (90/10)
  • Heat fuse this with a glass micropipette & backfill with cyanoacrylate. {1011}
    • Isobutyl acrylate is hydrolysed more slowly and hence is less toxic to the surrounding tissue
    • cyanoacrylate is apparently biodegradable.
  • Durable, stable: one electrode displayed a single cortical spike (though not necessarily the same one) for more than 90 consecutive days.
  • unacceptably low impedance = 100K or less
  • Unit activity was present only 10-24H after surgery.
  • formal review of even older microelectrode studies.
  • 10nA should be 100x too small to have any effect on a platinum tip [17]
  • A separable cell with a SNR of 3:1 would become lost if the electrode tip moved 15um away from a 20um soma.
    • "It becomes clear that the problem of holding single units for prolonged periods in the unrestrained animal is not achieved without considerable difficulty". Yet they think they have solved it.


Salcman, Michael and Bak, Martin J. Design, Fabrication, and In Vivo Behavior of Chronic Recording Intracortical Microelectrodes Biomedical Engineering, IEEE Transactions on BME-20 4 253 -260 (1973)

hide / / print
ref: -0 tags: robinson pasquali carbon nanotube fiber fluidic injection dextran neural electrode date: 12-28-2017 04:20 gmt revision:0 [head]

PMID-29220192 Fluidic Microactuation of Flexible Electrodes for Neural Recording.

  • Use viscous dextran solution + PDMS channel system
  • Durotomy (of course)
  • Parylene-C insulated carbon fiber electrodes, cut with FIB or razor blade
  • Used silver ink to electrically / mechanically attach for recordings.
  • Tested in hydra, rat brain slice (reticular formation of thalamus), and in-vivo rat.
  • Electrodes, at 12um diameter, E=120GPa, are approximately 127x stiffer than one 4x20um PI (E=9GPa) probe. Less damage though.
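The ~127x stiffness figure can be checked from simple beam theory (a sketch; assumes bending stiffness E*I, with the PI probe bending about its thin 4um axis):

```python
import math

# Bending stiffness E*I for the two probe geometries quoted in the entry.
E_cf, d_cf = 120e9, 12e-6            # carbon fiber: modulus (Pa), diameter (m)
E_pi, w_pi, t_pi = 9e9, 20e-6, 4e-6  # PI probe: modulus, width, thickness

I_cf = math.pi * d_cf**4 / 64        # second moment, circular cross-section
I_pi = w_pi * t_pi**3 / 12           # rectangle bending about its thin axis

ratio = (E_cf * I_cf) / (E_pi * I_pi)
print(round(ratio))  # 127, matching the paper's figure
```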

hide / / print
ref: -0 tags: Lieber nanoFET review silicon neural recording intracellular date: 12-28-2017 04:04 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

PMID-23451719 Synthetic Nanoelectronic Probes for Biological Cells and Tissue

  • Review of nanowireFETS for biological sensing
  • Silicon nanowires can be grown via vapor-liquid-solid or vapor-solid-solid, 1D catalyzed growth, usually with a Au nanoparticle.
  • Interestingly, kinks can be introduced via "iterative control over nucleation and growth", "allowing the synthesis of complex 2D and 3D structures akin to organic chemistry"
    • Doping can similarly be introduced in highly localized areas.
    • This bottom-up synthesis is adaptable to flexible and organic substrates.
  • Initial tests used polylysine patterning to encourage axonal and dendritic growth across a nanoFET.
    • Positively charged amino group interacts with negative surface charge phospholipid
    • Lieber's group coats their SU-8 electrodes in poly-d-lysine as well {1352}
  • Have tested multiple configurations of the nanowire FET, including kinked, one with a SiO2 nanopipette channel for integration with the cell membrane, and one where the cell-attached fluid membrane functions as the semiconductor; see figure 4.
    • Were able to show recordings as one of the electrodes was endovascularized.
  • It's not entirely clear how stable and scalable these are; Si and SiO2 gradually dissolve in physiological fluid, and no mention was made of longevity.

hide / / print
ref: Gilgunn-2012 tags: kozai neural recording electrodes compliant parylene flexible dissolve date: 12-28-2017 03:50 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

IEEE-6170092 (pdf) An ultra-compliant, scalable neural probe with molded biodissolvable delivery vehicle

  • Optical coherence tomography is cool.
  • Large footprint -- 150 or 300um wide, 135um thick (13500 or 40500 um^2; c.f. a tungsten needle: 1963 um^2 at 50um dia, or 490 um^2 at 25um dia).
  • Delivery vehicle is fabricated from biodissolvable carboxy-methylcellulose (CMC).
    • Device dissolves within three minutes of implantation.
    • Yet stiff enough to penetrate the dura of rats (with what degree of dimpling?)
    • Lithographic patterning process pretty clever, actually.
    • Parylene-X is ~ 1.1 um thick.
    • 500nm Pt is patterned via ion milling with a photoresist mask.
    • Use thin 20nm Cr etch mask for both DRIE (STS ICP) and parylene etch.
  • Probes are tiny -- 10um wide, 2.7um thick, coated in parylene-X.
  • CMC polymer tends to bend and warp due to stress -- must be clamped in a special jig.
  • No histology. Follow-up: {1399}
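The cross-sectional areas quoted above follow directly from the geometry (a quick sanity check, not from the paper):

```python
import math

# Cross-sectional footprints cited in the entry, in um^2.
def circle_area(d_um):
    """Area of a circular cross-section given diameter in um."""
    return math.pi * (d_um / 2) ** 2

tungsten_50 = circle_area(50)   # classic 50 um tungsten needle
tungsten_25 = circle_area(25)   # 25 um tungsten needle
probe = 10 * 2.7                # the ultra-compliant probe itself, 10 x 2.7 um

print(round(tungsten_50), round(tungsten_25), round(probe))  # 1963 491 27
```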

hide / / print
ref: -0 tags: optogenetics micro LED flexible electrodes PET rogers date: 12-28-2017 03:24 gmt revision:9 [8] [7] [6] [5] [4] [3] [head]

PMID-23580530 Injectable, cellular-scale optoelectronics with applications for wireless optogenetics.

  • Supplementary materials
  • 21 authors, University Illinois at Urbana-Champaign, Tufts, China, Northwestern, Miami ..
  • GaN blue and green LEDs fabricated on a flexible substrate with stiff inserter.
    • Inserter is released in 15 min with dissolving silk fibroin.
    • made of 250um thick SU-8 epoxy, reverse photocured on a glass slide.
  • GaN LEDS fabricated on a sapphire substrate & transfer printed via modified Karl-Suss mask aligner.
    • See supplemental materials for the intricate steps.
    • LEDs are 50um x 50um x 6.75um
  • Have integrated:
    • Temperature sensor (Pt serpentine resistor) / heater.
    • inorganic photodetector (IPD)
      • ultrathin silicon photodiode 1.25um thick, 200 x 200um^2, made on a SOI wafer
    • Pt extracellular recording electrode.
        • This insulated via 2um thick more SU-8.
  • Layers are precisely aligned and assembled via 500nm layer of epoxy.
    • Layers made of 6um or 2.5um thick mylar (polyethylene terephthalate (PET))
    • Layers joined with SU-8.
    • Wiring patterned via lift-off.
  • Powered via RF scavenging at 910 Mhz.
    • appeared to be simple, power in = light out; no data connection.
  • Tested vs control and fiber optic stimulation, staining for:
    • Tyrosine hydroxylase (makes l-DOPA)
    • c-fos, a neural activity marker
    • u-LEDs show significant activation.
  • Also tested for GFAP (astrocytes) and Iba1 (activated microglia); flexible & smaller devices had lower gliosis.
  • Next tested for behavior using a self-stimulation protocol; mice learned to self-stimulate to release DA.
  • Devices are somewhat reliable to 250 days!

hide / / print
ref: -0 tags: kozai CMC dissolving insertion shuttle parylene date: 12-28-2017 03:19 gmt revision:1 [0] [head]

PMID-25128375 Chronic tissue response to carboxymethyl cellulose based dissolvable insertion needle for ultra-small neural probes.

  • CMC = carboxymethyl cellulose, commonly used as a food additive, in toothpaste, etc.
  • To address CMC dissolution, we developed a sophisticated targeting, high speed insertion (∼80 mm/s), and release system to implant shuttles.
  • Cross section of the probes are large, 300 x 125um and 100 x 125um.
  • Beautiful histology: the wound does gradually close up as the CMC dissolves, but no e-phys.

hide / / print
ref: Kozai-2009.11 tags: electrodes insertion Kozai flexible polymer monolayer date: 12-28-2017 02:59 gmt revision:12 [11] [10] [9] [8] [7] [6] [head]

PMID-19666051[0] Insertion shuttle with carboxyl terminated self-assembled monolayer coatings for implanting flexible polymer neural probes in the brain.

  • This study investigated the use of an electronegative (hydrophilic) self-assembled monolayer (SAM) as a coating on a stiff insertion shuttle to carry a polymer probe into the cerebral cortex, and then the detachment of the shuttle from the probe by altering the shuttle's hydrophobicity.
    • Used 11-mercaptoundecanoic acid.
    • Cr/Au (of course) evaporated on 15um thick Si shuttle.
    • SAM attracts water once inserted, causing the hydrophobic polymer to move away.
      • Why not make the polymer hydrophilic?
      • Is this just soap?
  • Used agarose brain model.
  • Good list of references for the justification of soft electrodes, and researched means for addressing this, mostly using polymer stiffeners.
    • "Computer models and experimental studies of the probe–tissue interface suggest that flexible and soft probes that approach the brain’s bulk material characteristics may help to minimize micromotion between the probe and surrounding tissue ({737}; {1203}; {1102}; {1200}; LaPlaca et al., 2005; {1216}; Neary et al., 2003 PMID-12657694; {1198})"
  • "However, polymer probes stick to metallic and silicon surfaces through hydrophobic interactions, causing the polymer probe to be carried out of the brain when the insertion shuttle is removed. The solution is to use a highly hydrophillic, electronegative, self-assembled monolayer coating on the shuttle.
  • Biran et al 2005 suggests that incremental damage due to stab wounds from the shuttle (needle) should be minor.
  • Probes: 12.5um thick, 196um wide, and 1.2cm long; polyimide substrate and custom-designed lithographed PDMS probes.
  • Polymer probes were inserted deep - 8.5 mm.
  • PDMS probes inserted with non-coated insertion shuttle resulted in explantation of the PDMS probe.


[0] Kozai TD, Kipke DR, Insertion shuttle with carboxyl terminated self-assembled monolayer coatings for implanting flexible polymer neural probes in the brain.J Neurosci Methods 184:2, 199-205 (2009 Nov 15)

hide / / print
ref: -0 tags: platinum parylene electrodes brush dissolving stiffener gelatin date: 12-28-2017 02:44 gmt revision:0 [head]

PMID-27159159 Embedded Ultrathin Cluster Electrodes for Long-Term Recordings in Deep Brain Centers.

  • 12.5um pure Pt wires
  • Coated in 4um parylene-C
  • stiffened with gelatin
  • further protected with Kollicoat to retard dissolution.
  • Used a pulsed UV laser to ablate parylene, cut the platinum, and roughen the recording site.
  • See also {311}

hide / / print
ref: -0 tags: polyimide electrodes immune response foreign body inflammation stiffener steiglitz date: 12-28-2017 02:37 gmt revision:0 [head]

PMID-27534649 Intracortical polyimide electrodes with a bioresorbable coating.

  • Molten saccharose was used as coating material.
  • 270 x 10um polyimide recording probes. (large!)
  • Tissue reaction seems to peak at 2-4 weeks, and decline somewhere thereafter (though there were not a great number of samples).

hide / / print
ref: -0 tags: rogers thermal oxide barrier neural implants ECoG coating accelerated lifetime test date: 12-28-2017 02:29 gmt revision:0 [head]

PMID-27791052 Ultrathin, transferred layers of thermally grown silicon dioxide as biofluid barriers for biointegrated flexible electronic systems

  • Thermal oxide proved the superior -- by far -- water barrier for encapsulation.
    • What about the edges?
  • Many of the polymer barrier layers look like inward-rectifiers:
  • Extensive simulations showing that the failure mode is from gradual dissolution of the SiO2 -> Si(OH)4.
    • Even then a 100nm layer is expected to last years.
    • Perhaps the same principle could be applied with barrier metals. Anodization or thermal oxidation to create a thick, nonporous passivation layer.
    • Should be possible with Al, Ta...
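A rough barrier-lifetime estimate from a constant dissolution rate; the ~0.04 nm/day rate for thermal SiO2 in simulated biofluid at 37 C is an assumed illustrative value, not taken from this summary:

```python
# Back-of-envelope barrier lifetime assuming steady dissolution.
# rate_nm_per_day = 0.04 is an assumed illustrative value for thermal
# SiO2 in simulated biofluid at 37 C.
def barrier_lifetime_years(thickness_nm, rate_nm_per_day=0.04):
    return thickness_nm / rate_nm_per_day / 365.0

print(round(barrier_lifetime_years(100), 1))  # ~6.8 years for a 100 nm layer
```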

hide / / print
ref: -0 tags: Courtine e-dura PDMS silicone gold platinum composite stretch locomotion restoration rats date: 12-22-2017 01:59 gmt revision:0 [head]

PMID-25574019 Biomaterials. Electronic dura mater for long-term multimodal neural interfaces.

  • Fabrication:
    • 120um total PDMS thickness, made through soft lithography, covalent (O2 plasma) bonding between layers
    • 35nm of Au (thin!) deposited through a stencil mask.
    • 300um Pt-PDMS composite for electrode sites, deposited via screenprinting
  • 100 x 200um cross section drug delivery channel.
  • Compared vs. stiff 25um thick PI film electrode.
    • stiff implants showed motor impairments 1-2 weeks after implantation.
  • Showed remarkable recovery of supported locomotion with stimulation and drug infusion (to be followed by monkeys).

hide / / print
ref: -0 tags: Courtine PDMS soft biomaterials spinal cord e-dura date: 12-22-2017 01:29 gmt revision:0 [head]

Materials and technologies for soft implantable neuroprostheses

  • Quote: "In humans, both the spinal cord and its meningeal protective membranes can experience as much as 10-20% tensile strain and displacement (relative to the spinal canal) during normal postural movements. This motion corresponds to displacements on the order of centimetres [17]. The deformations relative to the spinal cord in animal models, such as rodents or non-human primates, are likely to be even larger."

hide / / print
ref: -2001 tags: polyimide Kipke bioactive flexible electrode arrays date: 12-22-2017 01:16 gmt revision:2 [1] [0] [head]

PMID-11327505 Flexible polyimide-based intracortical electrode arrays with bioactive capability.

  • Appears to be the first or one of the first use of thin-film polyimide for intracortical recording; will have to cite.
  • Fab protocol: 500nm release thermal oxide, photo-patternable PI, Cr-Au metallization, O2 plasma de-scum for adhesion (ish?), <20um total thickness.
  • Conductive epoxy attachment to connector.

hide / / print
ref: -0 tags: lieber mesh electronics SU-8 recording electrodes flexible polymer glass capillary date: 12-22-2017 00:14 gmt revision:0 [head]

PMID-29109247 Highly scalable multichannel mesh electronics for stable chronic brain electrophysiology

  • Key change was the addition of multiple conductor traces per longitudinal mesh line; this allows them to get 64 or 128 channels per mesh without a dramatic increase in modulus.
  • The latitudinal / diagonal lines still displace tissue ...
  • And the injection mechanism, glass pipette, 650um OD, 400um ID, is pretty large, even for 128 channels.
  • Use carbon nanotube ink, custom CNC printer, to connect to FPC.
    • Pretty impressive that they can manipulate ~800nm thick Su-8 film intraop and have it work well!

hide / / print
ref: -0 tags: computational biology evolution metabolic networks andreas wagner genotype phenotype network date: 06-12-2017 19:35 gmt revision:1 [0] [head]

Evolutionary Plasticity and Innovations in Complex Metabolic Reaction Networks

  • ‘’João F. Matias Rodrigues, Andreas Wagner ‘’
  • Our observations suggest that the robustness of the Escherichia coli metabolic network to mutations is typical of networks with the same phenotype.
  • We demonstrate that networks with the same phenotype form large sets that can be traversed through single mutations, and that single mutations of different genotypes with the same phenotype can yield very different novel phenotypes
  • Entirely computational study.
    • Examines what is possible given known metabolic building-blocks.
  • Methodology: collated a list of all metabolic reactions in E. coli (726 reactions, excluding 205 transport reactions) out of 5870 possible reactions.
    • Then ran random-walk mutation experiments to see where the genotype + phenotype could move. Each point in the genotype had to be viable on either a rich (many carbon source) or minimal (glucose) growth medium.
    • Viability was determined by Flux-balance analysis (FBA).
      • In our work we use a set of biochemical precursors from E. coli 47-49 as the set of required compounds a network needs to synthesize, ‘’’by using linear programming to optimize the flux through a specific objective function’’’, in this case the reaction representing the production of biomass precursors we are able to know if a specific metabolic network is able to synthesize the precursors or not.
      • Used Coin-OR and Ilog to optimize the metabolic concentrations (I think?) per given network.
    • This included the ability to synthesize all required precursor biomolecules; see supplementary information.
    • ‘’’“Viable” is highly permissive -- non-zero biomolecule concentration using FBA and linear programming. ‘’’
    • Genomic distance = normalized Hamming distance between binary vectors (1 = enzyme/reaction present, 0 = mutated off); distance 0 = identical genotype, 1 = completely different genotype.
  • Between pairs of viable genetic-metabolic networks, only a minority (30 - 40%) of reactions are essential,
    • Which naturally increases with increasing carbon source diversity:
    • When they go back and examine networks that can sustain life on any of (up to) 60 carbon sources, and again measure the distance from the original E. coli genome, they find this added robustness does not significantly constrain network architecture.
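The genotype-distance metric used in the study can be sketched as follows (a minimal illustration; the real vectors have thousands of entries):

```python
# Normalized Hamming distance between binary genotype vectors:
# 1 = reaction present in the network, 0 = mutated off.
def genotype_distance(a, b):
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b)) / len(a)

g1 = [1, 1, 0, 1, 0, 0, 1, 1]
g2 = [1, 0, 0, 1, 1, 0, 1, 0]
print(genotype_distance(g1, g2))  # 0.375 -- three of eight reactions differ
```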

Summary thoughts: This is a highly interesting study, insofar that the authors show substantial support for their hypotheses that phenotypes can be explored through random-walk non-lethal mutations of the genotype, and this is somewhat invariant to the source of carbon for known biochemical reactions. What gives me pause is the use of linear programming / optimization when setting the relative concentrations of biomolecules, and the permissive criteria for accepting these networks; real life (I would imagine) is far more constrained. Relative and absolute concentrations matter.

Still, the study does reflect some robustness. I suggest that a good control would be to ‘fuzz’ the list of available reactions based on statistical criteria, and see if the results still hold. Then, go back and make the reactions un-biological or less networked, and see if this destroys the measured degrees of robustness.

hide / / print
ref: -0 tags: photoacoustic tomography mouse imaging q-switched laser date: 05-11-2017 05:23 gmt revision:1 [0] [head]

Single-impulse panoramic photoacoustic computed tomography of small-animal whole-body dynamics at high spatiotemporal resolution

  • Used Q-switched Nd:YAG and Ti:Sapphire lasers to illuminate mice axially (from the top, through a diffuser and conical lens), exciting the photoacoustic effect, from which they were able to image at 125um resolution a full slice of the mouse.
    • I'm surprised at their mode of illumination -- how do they eliminate the out-of-plane photoacoustic effect?
  • Images look low contrast, but structures, e.g. cortical vasculature, are visible.
  • Can image at the rep rate of the laser (50 Hz), and thereby record cardiac and pulmonary rhythms.
  • Suggest that the photoacoustic effect can be used to image brain activity, but spatial and temporal resolution are limited.

hide / / print
ref: -0 tags: electrode area review impedance date: 04-28-2017 17:55 gmt revision:9 [8] [7] [6] [5] [4] [3] [head]

Quick review of electrode area / impedance within m8ta:

  • {895} 500um^2
  • {311} 490um^2 nominal; 900k
  • {1040} 108um^2, plated from 5M to 1M.
  • Neuronexus: 177, 413, 700, and 1250 um^2.
    • Suggest 177um for SUA, 413um for MUA.
    • Community consensus seems to be that these electrodes don't last as long, though.
    • Electroplating 177um^2 sites with PEDOT:PSS reduces impedance to 23k {1388}
  • {823} 122um^2 nominal
  • {736} 500um^2
  • {1027} (Utah) 1600um^2
    • Impedance: ~ 220K +-91K (in vivo -- large variance)
    • Blackrock site lists impedance @ 400k
  • SIROF Utah array -- 3100um^2 (3.1e-5cm^2) -- large!
    • Impedance ~50K according to [www.blackrockmicro.com/userfiles/file/Microelectrode%20Arrays.pdf Blackrock product brochure].
  • PMID-20124668 (Utah again) 2000um^2, 125k Pt, 6k SIROF.
  • Neuropixel: 144um^2 acid-etched TiN
  • Carbon fiber: ~38 um^2, PEDOT:PSS or PEDOT:pTS started ~4M, plated down to ~130k initial, went up to 2M PSS, 840k pTS.

hide / / print
ref: -2016 tags: Kozai carbon fiber microelectrodes JNE PEDOT PSS pTS date: 04-27-2017 01:42 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

PMID-27705958 Chronic in vivo stability assessment of carbon fiber microelectrode arrays.

  • showed excellent recording characteristics and nearly zero glial scarring.
  • 6.4um carbon fiber + 800nm parylene-C = 8.4um.
    • Cytec Thornel T-650 CF, Young's modulus = 255 GPa, tensile strength = 4.28 GPa, PAN-based.
  • Everything protected with our wonderful phenol epoxy 353NDT, heat-cure.
  • Used two coating solutions:
    • Solution of 0.01 M 3,4-ethylenedioxythiophene (483028, Sigma-Aldrich, St. Louis, MO): 0.1 M sodium p-toluenesulfonate (152536, Sigma-Aldrich, St. Louis, MO).
      • pTS is not that dissimilar from its alkyl cousin, SPS, {1353}. Likely a soapy chemical due to the opposed methyl and sulfonic acid groups; the benzene ring will take up less room in the polymer c.f. SDS & may lower the oxidation potential of EDOT.
      • Tosylates have been explored as a EDOT counterion : PMID-22383043 Characterization of poly(3,4-ethylenedioxythiophene):tosylate conductive polymer microelectrodes for transmitter detection. and PEDOT-TMA
    • Solution was composed of 0.01 M 3,4-ethylene-dioxythiophene (483028, Sigma-Aldrich, St. Louis, MO):0.1 M polystyrene sulfonate (m.w. 70.000, 222271000, Acros, NJ).
    • For each solution the electrodeposition was carried out by applying 100 pA/channel for 600 s to form a layer of poly(3,4-ethylenedioxythiophene):sodium p-toluenesulfonate (PEDOT:pTS) or poly(3,4-ethylenedioxythiophene):polystyrene sulfonate (PEDOT:PSS).
      • Weird, would use voltage control here..
  • According to works by Green et al [45] and Hukins et al [46], equation (1) can be used to determine the aging time that the fibers have undergone: t_37 = t_T * Q10^((T - 37)/10), where t_37 is the simulated aging time at 37 °C, t_T is the amount of real time that the samples have been kept at the elevated temperature T, and Q10 is an aging factor equal to 2, according to ASTM guidelines for polymer aging [47].
  • Show > 2MOhm impedance of the small-area electrodes. At the aging endpoint, PEDOT:pTS had about half the impedance of PEDOT:PSS.
    • 4M PSS, 7M pTS, both plated down to ~130k initial; went up to 2M PSS, 840k pTS.
  • Recording capability quite stellar
  • Likewise for the glial response.
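The accelerated-aging relation quoted above can be applied directly; for example (soak temperature and duration here are hypothetical):

```python
# Accelerated-aging relation: t37 = tT * Q10**((T - 37) / 10),
# with Q10 = 2 per ASTM guidelines for polymer aging.
def simulated_age_days(real_days, temp_c, q10=2.0):
    """Equivalent soak time at 37 C for samples held at temp_c."""
    return real_days * q10 ** ((temp_c - 37.0) / 10.0)

# e.g. a hypothetical 30-day soak at 67 C simulates 8x longer at body temp:
print(simulated_age_days(30, 67))  # 240.0 days
```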

hide / / print
ref: -0 tags: PEDOT PSS electroplate electrodeposition neural recording michigan probe stimulation CSC date: 04-27-2017 01:36 gmt revision:1 [0] [head]

PMID-19543541 Poly(3,4-ethylenedioxythiophene) as a micro-neural interface material for electrostimulation

  • 23k on a 177um^2 site.
  • demonstrated in-vitro durable stimulation.
  • Electrodeposited with 6 nA for 900 seconds per electrode.
    • Which is high -- c.f. 100 pA for 600 seconds {1356}
  • Greater CSC and lower impedance / phase than (comparable?) Ir or IrOx plating.

hide / / print
ref: -1977 tags: polyethylene surface treatment plasma electron irradiation mechanical testing saline seawater accelerated lifetime date: 04-15-2017 06:06 gmt revision:0 [head]

Enhancement of resistance of polyethylene to seawater-promoted degradation by surface modification

  • Polyethylene, when repeatedly stressed and exposed to seawater (e.g. ships' ropes), undergoes mechanical and chemical degradation.
  • Surface treatments of the polyethylene can improve resistance to this degradation.
  • The author studied two methods of surface treatment:
    • Plasma (glow discharge, air) followed by diacid (adipic acid) or triisocyanate (DM100, = ?) co-polymerization
    • Electron irradiation with 500 kEV electrons.
  • Also mention CASING (crosslinking by activated species of inert gasses) as a popular method of surface treatment.
    • Diffuse-in crosslinkers is a third, popular these days ...
    • Others diffuse in at temperature e.g. a fatty acid - derived molecule, which is then bonded to e.g. heparin to reduce the thrombogenicity of a plastic.
  • Measured surface modifications via ATR IR (attenuated total reflectance, IR) and ESCA (aka XPS)
    • Expected results, carbonyl following the air glow discharge ...
  • Results:
    • Triisocyanate, ~ 6x improvement
    • diacid, ~ 50 x improvement.
    • electron irradiation, no apparent degradation!
      • Author's opinion that this is due to carbon-carbon crosslink leading to mechanical toughening (hmm, evidence?)
  • Quote: since the PE formulation studied here was low-weight, it was expected to lose crystallinity upon cyclic flexing; high density PE's have in fact been observed to become more crystalline with working.
    • Very interesting, kinda like copper. This could definitely be put to good use.
  • Low density polyethylene has greater chain branching and entanglement than high-density resins; when stressed the crystallites are diminished in total bulk, degrading tensile properties ... for high-density resins, mechanical working loosens up the structure enough to allow new crystallization to exceed stress-induced shrinkage of crystallites; hence, the crystallinity increases.

hide / / print
ref: -0 tags: tungsten electropolishing hydroxide cleaning bath tartrate date: 03-28-2017 16:34 gmt revision:0 [head]

Method of electropolishing tungsten wire US 3287238 A

  • The bath is formed of 15% by weight sodium hydroxide, 30% by weight sodium potassium tartrate, and 55% by weight distilled water, with the bath temperature being between 70 and 100 F.
    • If the concentration of either the hydroxide or the tartrate is below the indicated minimum, the wire is electrocleaned rather than electropolished, and a matte finish is obtained rather than a specular surface.
    • If the concentration of either the hydroxide or the tartrate is greater than the indicated maximum, the electropolishing process is quite slow.
  • The voltage which is applied between the two electrodes 18 and 20 is from 16 to 18.5 volts, the current through the bath is 20 to 24 amperes, and the current density is 3,000 to 4,000 amperes per square foot of surface of wire in the bath.
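As a sanity check on the patent's numbers, the quoted current and current density together imply a total wire surface area in the bath. A minimal sketch; the 125 um wire diameter is an illustrative assumption, not from the patent:

```python
# Implied wire surface area in the bath from I and J, and the equivalent
# immersed length of a (hypothetically) 125 um diameter tungsten wire.
import math

FT2_TO_M2 = 0.3048 ** 2                    # square feet -> square meters

current_a = 20.0                           # low end of the quoted 20-24 A
current_density = 3000.0 / FT2_TO_M2       # 3000 A/ft^2 -> A/m^2

area_m2 = current_a / current_density      # wire surface area in the bath
d = 125e-6                                 # assumed wire diameter (m)
immersed_length_m = area_m2 / (math.pi * d)  # ~1.6 m of immersed wire

print(area_m2, immersed_length_m)
```

About 6.2 cm^2 of wire surface, i.e. roughly a meter and a half of 125 um wire in the bath at once, consistent with a continuous-feed polishing process.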

hide / / print
ref: -0 tags: polyimide electrodes thermosonic bonding Stieglitz adhesion delamination date: 03-06-2017 21:58 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

IEEE-6347149 (pdf) Improved polyimide thin-film electrodes for neural implants 2012

  • Tested adhesion to Pt / SiC using accelerated aging in saline solution.
  • Targeted at retinal prostheses.
  • Layer stack:
    • 50nm SiC deposited through PECVD @ 100C using SPS, with low frequency RF modulation.
    • 100nm Pt
    • 100nm Au
    • 100nm Pt
      • These layers will alloy during cure, and hence reduce stress.
    • 30nm SiC
    • 10nm DLC (not needed, imho; PI sticks exceptionally well to clean SiC)
  • Recent studies have concluded that adhesion to PI is through carbon bindings and not through oxide formation.
    • Adhesion of polyimide to amorphous diamond-like carbon and SiC deteriorates at a minimal rate.
  • Delamination is caused by residual stress, which is not only inevitable but a major driving force for cracking in thin films.
    • Different CTE in layer stack -> different contraction when cooling from process temperature.
  • Platinum, which evaporates at ~1770 C but is deposited at ~100 C (photoresists only withstand ~115 C), results in a high-stress interface.
    • Pt - Carbon bonds only occur above 1000C
  • After 9 and 13 days of incubation, the probes with 400 nm and 300 nm of SiC, respectively, which were not tempered, showed complete delamination of the Pt from the SiC.
    • 60C, 0.9 M NaCl, 1 year.
    • The SiC remained attached to the PI.
      • Tempering: repeated treatment at 450C for 15 min in a N2 atmosphere.
    • All other probes remained stable.
  • Notably, used thermosonic bonding to the PI films, using sputtered (seed layer) then 12um electroplated Au.
  • Also: fully cured the base layer PI film.
  • Used oxygen plasma de-scum after patterning with resists to get better SiC adhesion to PI.
    • And better inter-layer adhesion (fully cured the first polyimide layer @ 450C).
  • Conclusion: "The fact that none of the tempered samples delaminated even after ~5 years of lifetime (extrapolated for 37 C) shows a tremendous increase in adhesion."

hide / / print
ref: Seymour-2007.09 tags: neural probe design recording Kipke Seymour parelene MEA histology PEDOT date: 02-23-2017 23:52 gmt revision:13 [12] [11] [10] [9] [8] [7] [head]

PMID-17517431[0] Neural probe design for reduced tissue encapsulation in CNS.

  • See conference proceedings too: PMID-17947102[1] Fabrication of polymer neural probes with sub-cellular features for reduced tissue encapsulation.
    • -- useful information.
  • They use SU8 - photoresist! - as a structural material. See also this.
    • They use silicon as a substrate for the fabrication, but ultimately remove it. Electrodes could be made of titanium, modulo low conductivity.
  • Did not / could not record from these devices. Only immunochemistry.
  • Polymer fibers smaller than 7um are basically invisible to the immune system. See [2]
  • Their peripheral recording site is 4 x 5um - but still not invisible to microglia. Perhaps this is because of residual insertion trauma, or movement trauma? They implanted the device flush with the cortical surface, so there should have been little cranial tethering.
  • Checked the animals 4 weeks after implantation.
  • Peripheral electrode site was better than shank location, but still not perfect. Well, any improvement is a good one...
  • No statistical difference between 4x5um lattice probes, 10x4um probes, 30x4um, and solid (100um) knife edge.
    • Think that this may be because of electrode micromotion -- the lateral edge sites are still relatively well connected to the thick, rigid shank.
  • Observed two classes of immune reactivity --
    • GFAP reactive hypertrophied astrocytes.
    • Cells devoid of GFAP, neurofilament, and NeuN, but always OX-42 and often fibronectin and laminin positive as well.
    • Think that the second may be from meningeal cells pulled in with the stab wound.
  • Sensitivity is expected to increase with decreased surface area (but similar low impedance -- platinum black or oxidized iridium or PEDOT {1112} ).
  • Thoughts: it may be possible to put 'barbs' to relieve mechanical stress slightly after the probe location, preferably spikes that expand after implantation.
  • His thesis {1110}


[0] Seymour JP, Kipke DR, Neural probe design for reduced tissue encapsulation in CNS.Biomaterials 28:25, 3594-607 (2007 Sep)
[1] Seymour JP, Kipke DR, Fabrication of polymer neural probes with sub-cellular features for reduced tissue encapsulation.Conf Proc IEEE Eng Med Biol Soc 1no Issue 4606-9 (2006)
[2] Sanders JE, Stiles CE, Hayes CL, Tissue response to single-polymer fibers of varying diameters: evaluation of fibrous encapsulation and macrophage density.J Biomed Mater Res 52:1, 231-7 (2000 Oct)

hide / / print
ref: Schmidt-1993.11 tags: Normann utah array histology silicon electrode array cats date: 02-23-2017 22:03 gmt revision:4 [3] [2] [1] [0] [head]

PMID-8263001[0] Biocompatibility of silicon-based electrode arrays implanted in feline cortical tissue.

  • Tried two different times:
    • one day before euthanasia
    • 6 month implant.
  • Tried three different implants:
    • Uncoated silicon
    • Polyimide coating
    • Polyimide coating with a SiO2 adhesion layer / primer.
  • The last was the worst in terms of histopathological response.
  • Chronic implants showed relatively restrained immune response,
    • Gliosis was found around all tracks, 20-40um.
  • Encapsulation was less than 9um.
  • Edema and hemorrhage was minor but present on a subset of all implants.
  • Acute (24h) hemorrhage was more severe -- ~ 60%; edema ~ 20%.
  • Chronic histology revealed considerable macrophages w/ hemosiderin (a complex including ferritin)
  • See also [1]


[0] Schmidt S, Horch K, Normann R, Biocompatibility of silicon-based electrode arrays implanted in feline cortical tissue.J Biomed Mater Res 27:11, 1393-9 (1993 Nov)
[1] Jones KE, Campbell PK, Normann RA, A glass/silicon composite intracortical electrode array.Ann Biomed Eng 20:4, 423-37 (1992)

hide / / print
ref: -0 tags: carbon nanotube densification conductivity strength date: 02-23-2017 02:52 gmt revision:2 [1] [0] [head]

Super-strong and highly conductive carbon nanotube ribbons from post-treatment methods

  • Conductivity of 1.2e6 S/m, about that of stainless steel.
    • A 500 x 500nm wire, 1cm long, will have a resistance of ~33k.
  • Aerogel method: methane + ferrocene + thiophene + hydrogen.
    • Resulting in ~ 18% Fe, multi-walled carbon nanotubes, diameter 15nm, 15-20 walls.
  • Densified with a stainless-steel spatula on regular paper.
    • Resulting in ribbons 22um wide, 650nm thick.
  • Very high tensile strength, up to 5.2 GPa; modulus ~266 GPa.
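The wire-resistance figure in the bullet above follows from R = L / (σA); a quick check at the quoted conductivity:

```python
# R = L / (sigma * A) for a 500 x 500 nm cross-section, 1 cm long ribbon
# at the reported conductivity of 1.2e6 S/m (about stainless steel).
sigma = 1.2e6           # S/m
length = 0.01           # m (1 cm)
area = 500e-9 * 500e-9  # m^2
R = length / (sigma * area)
print(R)  # ~33 kOhm
```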

High-strength carbon nanotube fibre-like ribbon with high ductility and high electrical conductivity

  • Slightly higher conductivity, 1.82 - 2.24e6 S/m.
  • Rolled until it was 500nm thick!
  • Spun from an aerogel (!!) using ethanol + ferrocene + thiophene.

hide / / print
ref: -0 tags: iridium oxide nanotube intracellular recording electroplate MEA date: 02-22-2017 22:41 gmt revision:0 [head]

PMID-24487777 Iridium oxide nanotube electrodes for sensitive and prolonged intracellular measurement of action potentials.

  • Electrodeposition of IrOx "magically" forms 500nm tubes.
  • Holes in Si3N4 / SiO2 were formed via e-beam lithography; underlying Pt wires via liftoff.
  • Showed long (minutes) intracellular access, though it tended to dip with time.

hide / / print
ref: -0 tags: glassy carbon SU-8 pyrolysis CEC microelectrode stimulation stability platinum PEDOT date: 02-17-2017 00:05 gmt revision:2 [1] [0] [head]

A novel pattern transfer technique for mounting glassy carbon microelectrodes on polymeric flexible substrates

  • Use inert-atmosphere pyrolysis @ 900 - 1000 C of 20um SU-8 (which is aromatic) on a thermal oxide wafer.
  • Followed by spin & cure of PI.
  • Demonstrate strong carbonyl bonding of the glassy carbon with mechanical and FTIR testing.
  • Use of photosensitive PI allows through-vias to connect Cr/Au conductive traces.

PMID-28084398 Highly Stable Glassy Carbon Interfaces for Long-Term Neural Stimulation and Low-Noise Recording of Brain Activity

  • Use EIS to show superior charge-injection properties + stability of glassy carbon electrodes vs. Pt electrodes.
    • GC lasted > 5e6 pulses; Pt electrodes delaminated after 1e6 pulses.
    • Hydrogen bonding (above) is clearly superior to the neat PI-Pt interface
  • GC electrodes were, true to their name, glassy and much smoother than the platinum electrodes.
  • Further reduced impedance with PEDOT-PSS coating.
    • PEDOT-PSS coating on glassy carbon was, in their hands, far more stable than PEDOT-PSS on platinum.
  • All devices, GC, PEDOT:PSS, and Pt, had similar biocompatibility in their assay (figure 7)

hide / / print
ref: -0 tags: myoelectric EMG recording TMR prosthetics date: 02-13-2017 20:43 gmt revision:0 [head]

PMID: Man/machine interface based on the discharge timings of spinal motor neurons after targeted muscle reinnervation

  • General idea: deconvolve a grid-recorded EMG signal to infer the spinal motor neuron spikes, and use these to more accurately decode user intention.
  • EMG envelope is still fairly good...

hide / / print
ref: -0 tags: polypyrrole date: 02-09-2017 01:26 gmt revision:0 [head]

PMID-23307738 Bio-inspired Polymer Composite Actuator and Generator Driven by Water Gradients

  • Alas, water gradient driven, not electrically driven. Still, highly interesting ... check out the videos.

hide / / print
ref: -0 tags: carbon fiber thread spinning Pasquali Kemere nanotube stimulation date: 02-09-2017 01:09 gmt revision:0 [head]

PMID-25803728 Neural stimulation and recording with bidirectional, soft carbon nanotube fiber microelectrodes.

  • Poulin et al. demonstrated that microelectrodes made solely of CNT fibers [22] show remarkable electrochemical activity, sensitivity, and resistance to biofouling compared to conventional carbon fibers when used for bioanalyte detection in vitro [23-25].
  • Fibers were insulated with 3 um of the block copolymer polystyrene-polybutadiene (PS-b-PBD) (polybutadiene is synthetic rubber)
    • Selected for biocompatibility, flexibility, and resistance to flexural fatigue.
    • Available from Sigma-Aldrich.
    • Custom continuous dip-coating process.
  • 18um diameter, 15 - 20x lower impedance than equivalently sized PtIr.
    • 2.5 - 6x lower than W.
    • In practice, 43um dia, 1450um^2, impedance of 11.2 k; 12.6um, 151k.
  • Charge storage capacity 327 mC / cm^2; PtIr = 1.2 mC/cm^2
  • Wide water window of -1.5 V to +1.5 V, consistent with the noble electrochemical properties of C.
  • Lasts for over 97e6 pulsing cycles beyond the water window, vs 43e6 for PEDOT.
  • Tested via the 6-OHDA model of PD vs. standard PtIr stimulating electrodes, implanted via a 100um PI shuttle attached with PEG.
  • Yes, debatable...
  • Tested out to 3 weeks durability. Appear to function as well or better than metal electrodes.
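A hedged back-of-envelope on the charge storage capacities quoted above: using the note's 1450 um^2 site area as the geometric area for both materials (an assumption for illustration only), the charge per electrode scales directly with CSC:

```python
# Charge per electrode Q = CSC * geometric area, using the quoted
# 1450 um^2 site area for both materials (illustrative assumption).
site_area_cm2 = 1450e-8   # 1450 um^2 -> cm^2
csc_cnt = 327e-3          # C/cm^2, CNT fiber (quoted)
csc_ptir = 1.2e-3         # C/cm^2, PtIr (quoted)

q_cnt = csc_cnt * site_area_cm2    # ~4.7 uC
q_ptir = csc_ptir * site_area_cm2  # ~17 nC
ratio = q_cnt / q_ptir             # ~270x, independent of area

print(q_cnt, q_ptir, ratio)
```

The ~270x ratio is of course just the ratio of the two CSC figures, since the same area cancels.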

PMID-23307737 Strong, light, multifunctional fibers of carbon nanotubes with ultrahigh conductivity.

  • Full process:
    1. Dissolve high-quality, 5um long CNT in chlorosulfonic acid (the only known solvent for CNTs)
    2. Filter to remove particles
    3. Extrude liquid crystal dope through a spinneret, 65 or 130um orifice
    4. Into a coagulant, acetone or water
    5. Onto a rotating drum to put tension on the thread & align the CNTs.
    6. Wash in water and dry at 115C.
  • Properties:
    • Tensile strength 1 GPa +- 0.2 GPa.
    • Tensile modulus 120 GPa +- 50, best value 200 GPa
      • Pt: 168 GPa ; Au: 79 GPa.
    • Elongation to break 1.4 %
    • Conductivity: 0.3 MS/m, Iodine doped 5 +- 0.5 MS/m (22 +- 4 microhm cm)
      • Cu: 59.6 MS/m ; Pt: 9.4 MS/m ; Au: 41 MS/m
      • Electrical conductivity drops after annealing @ 600C
      • But does not drop after kinking and repeated mechanical cycling.
  • Theoretical modulus of MWCNT ~ 350 GPa.
  • Fibers well-aligned at ~90% the density (measured 1.3 g/cc) of close-packed CNT.
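For comparison's sake, a sketch of resistance per meter and density-normalized (specific) conductivity at the quoted values; the 10 um fiber diameter is an assumed illustrative value, not from the paper:

```python
# Resistance per meter of a 10 um diameter fiber (assumed diameter) at
# the quoted conductivities, plus a mass-normalized comparison vs copper.
import math

d = 10e-6                       # assumed fiber diameter (m)
A = math.pi * (d / 2) ** 2      # cross-sectional area, m^2

sigma_cnt = 5e6                 # S/m, iodine-doped CNT fiber (quoted)
sigma_cu = 59.6e6               # S/m, copper (quoted)

r_cnt = 1 / (sigma_cnt * A)     # ohm per meter, ~2.5 kOhm/m
r_cu = 1 / (sigma_cu * A)       # ~210 Ohm/m

# Specific conductivity (conductivity / density, S*m^2/kg):
spec_cnt = sigma_cnt / 1300.0   # CNT fiber density ~1.3 g/cc (quoted)
spec_cu = sigma_cu / 8960.0     # copper density

print(r_cnt, r_cu, spec_cnt, spec_cu)
```

Even per unit mass, copper still edges out the doped fiber at these numbers; the fiber's advantage is mechanical (strength, flex fatigue), not raw conductivity.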

hide / / print
ref: -0 tags: nanoprobe transmembrane intracellular thiol gold AFM juxtacellular date: 02-06-2017 23:45 gmt revision:3 [2] [1] [0] [head]

PMID-20212151 Fusion of biomimetic stealth probes into lipid bilayer cores

  • Used e-beam evaporation of Cr/Au/Cr 10/10/10 or 10/5/10 onto a Si AFM tip.
    • Approx 200nm diameter; ~1800 lipids interact at the circumference.
  • Exposed the Au in the sandwich via FIB
  • Functionalized the Au with butanethiol or dodecanethiol; the former is mobile on the surface, the latter is polycrystalline.
    • Butanethiol showed higher adhesion to the synthetic membranes
  • Measured the penetration force & displacement through synthetic multi-layer lipid bilayers.
    • These were made via a custom protocol with 1-stearoyl-2-oleoyl-sn-glycero-3-phosphocholine (SOPC) and cholesterol

PMID-21469728 Molecular Structure Influences the Stability of Membrane Penetrating Biointerfaces.

  • Surprisingly, hydrophobicity is found to be a secondary factor, with monolayer crystallinity the major determinant of interface strength
  • Previous studies using ellipsometry and IR spectroscopy have shown that alkanethiol self-assembled monolayers display an abrupt transition from a fluid to a crystalline phase between hexanethiol and octanethiol.
    • This suggests the weakening of the membrane stealth probe interface is due to the crystallinity of the molecular surface with fluid, disordered monolayers promoting a high strength interface regime and rigid, crystalline SAMs forming weak interfaces.

hide / / print
ref: -0 tags: nanopore membrane nanostraws melosh surface adhesion intracellular date: 02-06-2017 23:34 gmt revision:0 [head]

PMID-22166016 Nanostraws for Direct Fluidic Intracellular Access

  1. Used track-etched polycarbonate membranes, which have controlled pore density & ID.
  2. Deposited alumina on the pores & external surfaces using ALD
  3. Then etched away the top alumina
  4. and finally used O2 RIE to etch away the polycarbonate.
  • Show that these nanopores have cytosolic access (via Fluor 488 hydrazide, a membrane-impermeant dye)
  • Also used nanostraws to deliver Co+2 to quench GFP fluorescence.

PMID-24710350, Quantification of nanowire penetration into living cells.

  • We discover that penetration is a rare event: 7.1±2.7% of the nanostraws penetrate the cell to provide cytosolic access for an extended period for an average of 10.7±5.8 penetrations per cell.
  • Using time-resolved delivery, the kinetics of the first penetration event are shown to be adhesion dependent and coincident with recruitment of focal adhesion-associated proteins.
    • Hours for unmodified, 5 minutes for adhesion-promoting surface.
  • Chinese hamster ovary cells expressing GFP, Co+2 quenching, EDTA chelation.
  • To modulate cell adhesion, nanostraw substrates were incubated in 10 μg ml−1 fibronectin, a well-characterized cell adhesion molecule, in addition to the standard polyornithine coating.

hide / / print
ref: -0 tags: review neural recording penn state extensive biopolymers date: 02-06-2017 23:09 gmt revision:0 [head]

PMID-24677434 A Review of Organic and Inorganic Biomaterials for Neural Interfaces

  • Not necessarily insightful, but certainly exhaustive review of all the various problems and strategies for neural interfacing.
  • Some emphasis on graphene, conductive polymers, and biological surface treatments for reducing FBR.
  • Cites 467 articles!

hide / / print
ref: -0 tags: intracellular juxtacellular recording tungsten nanowire whole cell patch date: 02-06-2017 22:39 gmt revision:2 [1] [0] [head]

PMID-22905231 Neuronal recordings with solid-conductor intracellular nanoelectrodes (SCINEs).

  • <300 nm diameter W fibers, several um long, fabricated via FIB.
  • Functionalized with a hydrophobic silane on the oxide.
    • Quite complete & custom methods here.
  • Not quite whole cell recording, but excellent SNR; 4mv APs.
    • Slice, rat hippocampus organotypic.
    • Expected much larger recorded APs; suspect partial membrane penetration.
    • Only lasted a few seconds to minutes.
  • Needed custom recording setup for interfacing with 100Gohm electrodes; stray capacitance < 4 pf.
  • Intracellular electrodes must be designed to not shunt the membrane open upon insertion.
    • In a study where whole-cell recordings were established prior to sharp microelectrode penetration, all neurons showed significant depolarization following impalement.
    • Here, there was no change in membrane voltage in 10% of insertions of the silane-functionalized SCINEs (only in the functionalized electrodes).
    • Minor distortion of the AP was observed.
  • In whole-cell patch clamping, diffusion from the pipette to the cytosol interrupts biochemical processes necessary for normal cellular function (e.g. respiration!).
  • The hardness of the tungsten ensures that SCINEs can be repeatedly inserted millimeter-deep into brain tissue without noticeable damage to the tip.
    • E.g. 300 nm tungsten will not easily navigate vasculature...

hide / / print
ref: -0 tags: carbon fiber pitch based tensile strength date: 02-04-2017 00:07 gmt revision:4 [3] [2] [1] [0] [head]

Contenders for high-modulus pitch-based carbon fiber:

Corp                     | Model        | Young's modulus | Tensile strength | Diameter | Elongation at break
Nippon Graphite Fiber Co | Granoc XN-90 | 860 GPa         | 3.43 GPa         | 10 um    | 0.4%
Mitsubishi Rayon         | K13D2U       | 940 GPa         | 3.21 GPa         | 11 um    | 0.36%
Cytec Thornel            | P-120        | 830 GPa         | 2.41 GPa         | ??       | 0.3-0.5%
Cytec Thornel            | K1100        | 965 GPa         | 3.10 GPa         | 10 um    | ??

Tensile and Flexural Properties of single carbon fibers

  • High modulus pitch-based carbon fibers have quite low compressive and shear strengths. The flexural strength could be affected strongly by its low strength under compression and shear loading.

hide / / print
ref: -0 tags: bone marrow transplant chimera immune response to indwelling electrode implant capadona inflammation date: 02-02-2017 23:24 gmt revision:1 [0] [head]

PMID-24973296 The roles of blood-derived macrophages and resident microglia in the neuroinflammatory response to implanted intracortical microelectrodes.

  • Quite good introductory review on current understanding of immune / inflammatory / BBB breakdown response to indwelling neural implants.
  • Used chimera mice with marrow from CFP mice transplanted into irradiated hosts, so myeloid cells were labeled (including macrophages and monocytes).
    • Details of this process are properly fascinating ... there are clever ways of isolating and selecting the right marrow cells.
  • Implanted with a dummy Michigan style probe, 2mm x 123 um x 15um.
  • Histological processes and cell sorting / labeling also highly detailed.
  • 60% of the infiltrating cells (CFP+) are macrophages.
    • Within the total IBA1+ population (macrophages + microglia), we saw that only 20% of the total IBA1+ population was comprised of microglia at two weeks post implantation (Fig. 9G).
    • Additionally, at chronic time points (four, eight and sixteen weeks), we observed that less than 40% of the total IBA1+ population was comprised of microglia (Fig. 9G).
    • On the other hand, no significant differences were observed in microglia populations over time (Fig. 9G, Table 4). Together, our results suggest a predominant role of infiltrating macrophages surrounding implanted microelectrodes over time.
  • IBA1 = marker for ionized calcium binding adapter molecule, to label the total population of microglia/ macrophages (both resting and activated)
  • CD68 = activated microglia / macrophage.
    • Hard to discriminate microglia and infiltrating macrophages.
  • Interestingly, fluctuations in GFAP+ immunoreactivity correlated well with neuronal density and CFP+ immunoreactivity, suggesting a possible role of astrocytes in facilitating trafficking of blood-derived cells.
  • Contrary to what has been suggested by many intracortical microelectrode studies, a consistent connection was not found between activated microglia/macrophages and neuron density in our chimera models

hide / / print
ref: -0 tags: nanotube tracking extracellular space fluorescent date: 02-02-2017 22:13 gmt revision:0 [head]

PMID-27870840 Single-nanotube tracking reveals the nanoscale organization of the extracellular space in the live brain

  • Extracellular space (ECS) takes up nearly a quarter of the volume of the brain (!!!)
  • Used the intrinsic fluorescence of single-walled carbon nanotubes @ 1um, 845nm excitation, with super-resolution tracking of diffusion.
    • Were coated in phospholipid-polyethylene glycol (PL-PEG), which display low cytotoxicity compared to other encapsulants.
  • 5ul, 3ug/ml injected into the ventricles of young rats; allowed to diffuse for 30 minutes post-injection.
  • No apparent response of the microglia.
  • Diffusion tracking revealed substantial dead-space domains in the ECS.
    • As compared to patch-clamp loaded SWCNTs
  • Estimate from parallel and perpendicular diffusion rates that the characteristic scale of ECS dimension is 80 to 270nm, or 150 +- 40nm.
  • The ECS nanoscale dimensions as visualized by tracking are similar in dimension and tortuosity to electron microscopy.
  • Viscosity of the extracellular matrix is 1 to 50 mPa·s, up to two orders of magnitude higher than the CSF.
  • Positive control through hyaluronidase + several hours to digest the hyaluronic acid.
    • But no observed changes in the morphology of the neurons via confocal .. interesting.
    • Enzyme digestion normalized the spatial heterogeneity of diffusion.

hide / / print
ref: -0 tags: juxtacellular recording gold mushroom cultured hippocampal neurons Spira date: 02-01-2017 02:44 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

Large-Scale Juxtacellular Recordings from Cultured Hippocampal Neurons by an Array of Gold-Mushroom Shaped Microelectrodes

  • Micrometer sized Au mushroom MEA electrodes.
  • Functionalized with poly-ethylene-imine (PEI, positively charged) / laminin (an extracellular matrix protein); the neurons undergo a process to form juxtacellular junctions between themselves and the gMµEs.
  • No figures, but:
    • Whereas substrate integrated planar MEA record FPs dominated by negative-peak or biphasic-signals with amplitudes typically ranging between 40-100 µV and a signal to noise ratio of ≤ 5,
    • The gMµE-MEA recordings were dominated by positive monophasic action potentials.
    • It is important to note that monophasic high peak amplitudes ≥ 100 µV are rarely obtained using planar electrodes arrays, whereas when using the gMµE-MEA, 34.48 % of the gMµEs recorded potentials ≥ 200 µV and 10.64 % recorded potentials in the range of 300-5,085 µV.
  • So, there is a distribution of coupling, approximately 10% "good".

PMID-27256971 Multisite electrophysiological recordings by self-assembled loose-patch-like junctions between cultured hippocampal neurons and mushroom-shaped microelectrodes.

  • Note 300uV - 1mV extracellular 'juxtacellular' action potentials from these mushroom recordings. This is 2 - 5x better than microwire extracellular in-vivo ephys; coupling is imperfect.
    • Sharp glass-insulated W electrodes, ~ 10Mohm, might achieve better SNR if driven carefully.
  • 2um mushroom cap Au electrodes, 1um diameter 1um long shaft
    • No coating, other than the rough one left by the electroplating process.
    • Impedance 10 - 25 Mohm.
  • APs decline by up to 35% within a burst -- electrostatic reasons?
  • Most electrodes record more than one neuron, similar to in-vivo ephys, with less LFP coupling.

PMID-23380931 Multi-electrode array technologies for neuroscience and cardiology

  • The key to the multi-electrode-array ‘in-cell recording’ approach developed by us is the outcome of three converging cell biological principles:
    • (a) the activation of endocytotic-like mechanisms in which cultured Aplysia neurons are induced to actively engulf gold mushroom-shaped microelectrodes (gMμE) that protrude from a flat substrate,
    • (b) the generation of high Rseal between the cell’s membrane and the engulfed gMμE, and
    • (c) the increased junctional membrane conductance.
  • Functionalized the Au mushrooms with an RGD-based peptide
    • RGD is an extracellular matrix binding site on fibronectin, which mediates its interaction with integrin, a cell surface receptor; it is thought that other elements of fibronectin regulate specificity with its receptor. PMID-2418980

hide / / print
ref: -0 tags: vertical nanowire juxtacellular recording date: 02-01-2017 00:50 gmt revision:2 [1] [0] [head]

PMID-22231664 Vertical nanowire electrode arrays as a scalable platform for intracellular interfacing to neuronal circuits.

  • Note actual coupling is low, 0.002, compared to patch-clamp (400uV vs 200mV). Signal is rather noisy.
  • Dissociated cultures of rat cortical neurons
  • Stimulation current of 200 pA is enough to change the membrane potential, but not initiate a spike.
    • This is 200e-12 / 20e-6 = 5 orders of magnitude lower current than typical ICMS.
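The order-of-magnitude claim above checks out (the ~20 uA ICMS figure is the note's own comparison point):

```python
# 200 pA nanowire stimulation vs a typical ~20 uA ICMS current:
# how many orders of magnitude apart?
import math

i_nanowire = 200e-12   # A
i_icms = 20e-6         # A
orders = math.log10(i_icms / i_nanowire)
print(orders)  # 5.0
```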

hide / / print
ref: -0 tags: microstimulation rat cortex measurement ICMS spread date: 01-26-2017 02:52 gmt revision:0 [head]

PMID-12878710 Spatiotemporal effects of microstimulation in rat neocortex: a parametric study using multielectrode recordings.

  • Measure, using extracellular ephys, a spread of ~1.3mm from near-threshold microstimulation.
  • Study seems thorough despite limited techniques.

hide / / print
ref: -0 tags: direct electrical stimulation neural mapping review date: 01-26-2017 02:28 gmt revision:0 [head]

PMID-22127300 Direct electrical stimulation of human cortex -- the gold standard for mapping brain functions?

  • Fairly straightforward review, shows the strengths and weaknesses / caveats of cortical surface stimulation.
  • Axon initial segment and nodes of Ranvier (which has a high concentration of Na channels) are the most excitable.
  • Stimulation of a site in the LGN of the thalamus increased the BOLD signal in the regions of V1 that received input from that site, but strongly suppressed it in the retinotopically matched regions of extrastriate cortex.
  • To test the hypothesis that the deactivation of extrastriate cortex might be due to synaptic inhibition of V1 projection neurons, GABA antagonists were microinjected into V1 in monkeys in experiments that combined fMRI, ephys, and microstim.
    • Ref 25. PMID-20818384
    • These findings suggest that the stimulation of cortical neurons disrupts the propagation of cortico-cortico signals after the first synapse.
    • Likely due to feedforward and recurrent inhibition.
  • Revisit the hypothesis of tight control of excitation and inhibition (e.g. in-vivo patch clamping + drugs). "The interactions between excitation and inhibition within cortical microcircuits as well as between inter-regional connections hamper the predictability of stimulation."
  • The average size of an fMRI voxel:
    • 55 ul (= 55 mm^3)
    • 5.5e6 neurons,
    • 22 - 55e9 synapses,
    • 22km dendrites (??)
    • 220km axons.
  • In the 1970s, Daniel Pollen conducted a series of studies stimulating the visual cortex of cats and humans.
    • Observed long intra-stim responses, and post-stim afterdischarges.
    • Importantly, he also observed inhibitory effects of DES on cortical responses at the stimulation site.
      • The inhibitory effect depended on the state of the neuron before stimulation.
      • High spontaneous activity + low stim strengths = inhibition;
      • low spontaneous activity + high stim strengths = excitation.
  • In the author's opinion, there is an equal or greater number of inhibitory responses to electrical microstimulation as excitatory. Only, there is a reporting bias toward the positive.
  • Many locations for paresthesias:
    • postcentral sulcus (duh)
    • opercular area of the inferior postcentral gyrus (i.e. superior to and facing the temporal lobe) [60]
    • posterior cingulate gyrus
    • supramarginal gyrus
    • temporal lobe, limbic and isocortical structures.
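The voxel-content figures quoted above can be turned into per-microliter densities as a sanity check:

```python
# Per-microliter densities implied by the note's fMRI-voxel figures
# (55 ul voxel; totals are the quoted numbers).
voxel_ul = 55.0
neurons = 5.5e6
synapses_low, synapses_high = 22e9, 55e9
dendrite_km, axon_km = 22.0, 220.0

neurons_per_ul = neurons / voxel_ul                  # 1e5 neurons/ul
syn_per_neuron = (synapses_low / neurons,
                  synapses_high / neurons)           # 4,000 - 10,000
dendrite_m_per_ul = dendrite_km * 1000 / voxel_ul    # ~400 m/ul
axon_m_per_ul = axon_km * 1000 / voxel_ul            # ~4 km/ul

print(neurons_per_ul, syn_per_neuron, dendrite_m_per_ul, axon_m_per_ul)
```

1e5 neurons and several thousand synapses per neuron per microliter are in the commonly cited range for cortex, which suggests the quoted voxel totals are internally consistent.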

hide / / print
ref: -0 tags: Kleinfeld vasculature cortex review ischemia perfusion date: 01-22-2017 19:40 gmt revision:3 [2] [1] [0] [head]

PMID-25705966 Robust and fragile aspects of cortical blood flow in relation to the underlying angioarchitecture.

  • "The penetrating arterioles that connect the pial network to the subsurface network are bottlenecks to flow; occlusion of even a single penetrating arteriole results in the death of a 500 μm diameter cylinder of cortical tissue despite the potential for collateral flow through microvessels."
  • The pioneering work of Fox and Raichle [7] suggest that there is simply not enough blood to go around if all areas of the cortex were activated at once.
  • There is strong if only partially understood coupling between neuronal and vascular dysfunction [15]. In particular, vascular disease leads to neurological decline and diminished cognition and memory [16].
  • A single microliter of cortex holds nearly one meter of total vasculature length (wow!) PMID-23749145
  • Subsurface micro vasculature (not arterioles or venules) is relatively robust to occlusion; figure 4.

hide / / print
ref: -0 tags: polyimide precursors date: 01-22-2017 06:03 gmt revision:1 [0] [head]


Dianiline / diamine:

hide / / print
ref: -0 tags: polyimide aqueous degradation kapton date: 01-22-2017 05:51 gmt revision:0 [head]

Aqueous degradation of polyimides

  • Above pH 2, Kapton (PMDA-ODA) test specimens decreased in both tensile strength and elongation to break with water, at a rate that increased with temperature.
  • No samples completely degraded, however; tensile strength decreased by about 2x, and elongation from 30% to 5%.
  • The authors suspect that ortho (off-molecular axis) amide bonding, at about 0.6% of the total number of imide bonds, is responsible for this (otherwise the film would completely fall apart.)
  • Imide bonds themselves are robust to all but strong bases and acids.
  • See also {1253}.

hide / / print
ref: -0 tags: polyimide stieglitz stability date: 01-22-2017 05:35 gmt revision:1 [0] [head]

PMID-20144477 In vitro evaluation of the long-term stability of polyimide as a material for neural implants

  • PI degrades at 85C in PBS; otherwise, it's stable.
  • Mechanical tests only; no electrical tests.
  • Durimide 7510 contains a photo-initiator and an adhesion promoter. Spin-coatable.
    • Adhesion can be inhibited with C4F8
    • Notably softer.
  • Dupont Kapton is PMDA-ODA (ether linkage in the diamine); PI-2611 is BPDA-PPD (aromatic carbon-carbon linkage in the dianhydride). The latter resists water uptake better.

hide / / print
ref: -0 tags: graphene polyimide polymerization date: 01-22-2017 05:20 gmt revision:3 [2] [1] [0] [head]

Preparation and properties of graphene oxide/polyimide composite films with low dielectric constant and ultrahigh strength via in situ polymerization

  • The GO/PI composite films provide ultrahigh tensile strength (up to 844 MPa) and Young's modulus (20.5 GPa).
    • Almost 10x increase in tensile strength!
    • And even larger increase in modulus.
  • Also, you can reduce graphene / graphite oxide with an infrared laser: http://pubs.acs.org/doi/abs/10.1021/nn204200w

hide / / print
ref: Bartels-2008.09 tags: neurotrophic kennedy speech FM transmitter wireless Georga recording electrophysiology electrode date: 01-19-2017 02:18 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

PMID-18672003[0] Neurotrophic electrode: method of assembly and implantation into human motor speech cortex.

  • Glass electrode with 3-4 2mil Teflon insulated Au wires within it to record spiking.
  • Induce neurites (e.g. dendrites, axons, blood vessels, oligodendrocytes) to grow up into it using autologous sciatic nerve, and stay for the lifetime of the patient (Kennedy 1989) [1].
    • Histology has revealed axons, but not neurons, within the tissue inside the tip. (Kennedy 1989, 1992a.)
    • No glia in rat and monkey tests; PMID-1421115
    • Inserted 5-6mm into the cortex at an angle of 45 deg. far!?
  • Bipolar amplification on pairs of the Au wires.
  • Patients damaged their electrodes due to spasms; same for monkeys, presumably. Seems the electronics and gold wires are also highly fragile. I'm quite familiar with this.
  • Includes a sine wave source for calibration. good idea!
  • Inductively powered @ 1MHz.
  • FM modulation at 39.2MHz and 43.9MHz. COTS?
    • The implantable electronics are bulky as can be seen in Figs. 14 and 19. (what a mess?!)
  • 3 patients; 4 years in 2 patients that died from unrelated causes, over 3 years in a third.
  • describe construction of electrode -- not complicated.


[0] Bartels J, Andreasen D, Ehirim P, Mao H, Seibert S, Wright EJ, Kennedy P, Neurotrophic electrode: method of assembly and implantation into human motor speech cortex.J Neurosci Methods 174:2, 168-76 (2008 Sep 30)
[1] Kennedy PR, The cone electrode: a long-term electrode that records from neurites grown onto its recording surface.J Neurosci Methods 29:3, 181-93 (1989 Sep)

hide / / print
ref: -0 tags: kennedy neurotropic electrode date: 01-19-2017 01:47 gmt revision:2 [1] [0] [head]

PMID-9237542 Activity of single action potentials in monkey motor cortex during long-term task learning. Kennedy PR, Bakay RA.

  • 2mm glass cone electrode, filled with matrigel and nerve growth factor, was implanted into layer 5/6 of the monkey motor cortex.
    • Matrigel: a solubilized basement membrane preparation extracted from the Engelbreth-Holm-Swarm (EHS) mouse sarcoma, a tumor rich in such ECM proteins as laminin (a major component), collagen IV, heparan sulfate proteoglycans, entactin/nidogen, and a number of growth factors.
      • Used extensively in cell culture work.
      • Previous studies used 'autologous sciatic nerve'.
    • Of note, this was no less invasive than a Utah array; its virtue lies in stability.
  • Ingrowing cells became myelinated [4]
  • Recording quality about the same as extracellular recordings: 40 to 80um amplitude.
    • Makes sense, as there was no reason for the neurites (no somas grew in!) to attach to the gold microwires.
  • Rather short communication describing what appears to be the idiosyncratic behavior of 3 neurons...

hide / / print
ref: -0 tags: serial electron microscopy Lichtman reconstruction nervous tissue date: 01-17-2017 23:32 gmt revision:0 [head]

PMID-26232230 Saturated Reconstruction of a Volume of Neocortex.

  • Data presented at Cell "Big Questions in Neuroscience", perhaps the most impressive of the talks.

hide / / print
ref: -0 tags: neural coding rats binary permutation retrosplenial basolateral amygdala tetrode date: 12-19-2016 07:39 gmt revision:1 [0] [head]

PMID-27895562 Brain Computation Is Organized via Power-of-Two-Based Permutation Logic.

  • Nice and interesting data, sort of kitchen sink of experiments but ...
  • At first blush it seems they have re-discovered Haar wavelets / the utility of binary decompositions.
  • Figures 9 and 10, however, suggest a discriminable difference in representation in layers 2/3 and 5/6, supporting their binary hypothesis.
    • The former targeted the mouse's large retrosplenial cortex; the latter, the hamster's prelimbic cortex.

hide / / print
ref: -0 tags: L1 cell adhesion neural implants microglia DRG spinal cord dorsal root inflammation date: 11-19-2016 22:55 gmt revision:1 [0] [head]

PMID-22750248 In vivo effects of L1 coating on inflammation and neuronal health at the electrode-tissue interface in rat spinal cord and dorsal root ganglion.

  • Kolarcik CL, Bourbeau D, Azemi E, Rost E, Zhang L, Lagenaur CF, Weber DJ, Cui XT.
  • Quote: With L1, neurofilament staining was significantly increased while neuronal cell death decreased.
  • These results indicate that L1-modified electrodes may result in an improved chronic neural interface and will be evaluated in recording and stimulation studies.
  • Ok, so this CAM seems to mitigate microglia / inflammation, but how was it selected vs any of the other CAMs and surface proteins? (This domain is almost completely unknown to me..)
  • Ultimate strategy is likely to be a broad combination of mechanical (size, flexibility), biochemical (inflammation, cell migration), electrochemical (surface coatings) and vasculature-avoiding approaches.

hide / / print
ref: -0 tags: LCP polymer Zeus tensile modulus date: 11-11-2016 20:39 gmt revision:0 [head]


  • UTS 1.0 GPa; 80 MPa Young's modulus.
  • No data on moisture uptake or molecular structure.

hide / / print
ref: -0 tags: china trustwothiness social engineering communism date: 10-31-2016 05:42 gmt revision:1 [0] [head]

China 'social credit': Beijing sets up huge system

So long as it purports to measure just one social variable -- 'trustworthiness' -- it might be a good idea. Many commerce websites (.. ebay ..) have these sorts of rating systems already, and they are useful. When humans live in smaller communities something like this is in the shared consciousness.

Peering into everyone's purchasing habits and hobbies, however, seems like it will be grossly myopic and, as the article says, Orwellian. Likely they will train a deep-belief network on past data of weakly- and communist-party-defined success, with all purchasing and social media as the input data, and use that in the proprietary algorithm for giving people their scalars to optimize. This would be the ultimate party control tool -- a great new handle for controlling people's minds, even 'better' than capitalism.

Surprising that the article only hints at this, and that the Chinese themselves seem rather clueless that it's a power play. In this sense, it's a very clever play to link it to reproduction.

Other comments:

These sorts of systems may be necessary in highly populated countries, where freedom and individuality are less valued and social cohesion is requisite.

hide / / print
ref: -0 tags: tungsten rhenium refactory metals book russia metalurgy date: 10-31-2016 05:14 gmt revision:1 [0] [head]

Physical Metallurgy of Refractory Metals and Alloys

Properties of tungsten-rhenium alloys

  • Luna metals suggests 3% Re improves the tensile strength of the alloy; Concept Alloys has 26% Re.
  • This paper measured 20% Re, with a strength of 1.9 GPa; actual drawn tungsten wire has a strength of 3.3 GPa.
    • Drawing and cold working greatly affects metal, as always!

hide / / print
ref: -0 tags: PEDOT electropolymerization electroplating gold TFB borate counterion acetonitrile date: 10-18-2016 07:49 gmt revision:3 [2] [1] [0] [head]

Electrochemical and Optical Properties of the Poly(3,4-ethylenedioxythiophene) Film Electropolymerized in an Aqueous Sodium Dodecyl Sulfate and Lithium Tetrafluoroborate Medium

  • EDOT has a higher oxidation potential than water, which makes polymers electropolymerized from water "poorly defined".
  • Addition of SDS lowers the oxidation potential to 0.76V, below that of EDOT in acetonitrile at 1.1V.
  • " The potential was first switched from open circuit potential to 0.5 V for 100 s before polarizing the electrode to the desired potential. This initial step was to allow double-layer charging of the Au electrode|solution interface, which minimizes the distortion of the polymerization current transient by double-layer capacitance charging.17,18 "
    • Huh, interesting.
  • Plated at 0.82 - 0.84V, 0.03M EDOT conc.
  • 0.1M LiBF4 anion / electrolyte; 0.07M SDS surfactant.
    • This SDS is incorporated into the film, and affects redox reactions as shown in the cyclic voltammogram (fig 4)
      • Doping level 0.36
    • BF4-, in comparison, can be driven out of the film.

Improvement of the Electrosynthesis and Physicochemical Properties of Poly(3,4-ethylenedioxythiophene) Using a Sodium Dodecyl Sulfate Micellar Aqueous Medium

  • The oxidation potential of thiophene = 1.8V; water = 1.23V.
  • Claim: "The polymer films prepared in micellar medium [SDS] are more stable than those obtained in organic solution as demonstrated by the fact that, when submitted to a great number of redox cycles (n ≈ 50), there is no significant loss of their electroactivity (<10%). These electrochemical properties are accompanied by color changes of the film which turns from blue-black to red-purple upon reduction."
  • Estimate that there is about 21% DS- anions in the PEDOT - SDS films.
    • Cl - was at ~ 7%.
  • I'm still not sure about incorporating soap into the electroplating solution.. !

Electrochemical Synthesis of Poly(3,4-ethylenedioxythiophene) on Steel Electrodes: Properties and Characterization

  • 0.01M EDOT and 0.1M LiClO4 in acetonitrile.
  • Claim excellent adhesion & film properties to 316 SS.
  • Oxidation / electrodeposition at 1.20V; voltages higher than 1.7V resulted in flaky films.

PMID-20715789 Investigation of near ohmic behavior for poly(3,4-ethylenedioxythiophene): a model consistent with systematic variations in polymerization conditions.

  • Again use acetonitrile.
  • 1.3V vs Ag/AgCl electrode.
  • Perchlorate and tetrafluoroborate both seemed the best counterions (figure 4).
  • Figure 5: Film was difficult to remove from surface.
    • They did use a polycrystalline Au layer:
    • "The plating process was allowed to run for 1 min (until approximately 100 mC had passed) at a constant potential of 0.3 V versus Ag/AgCl in 50 mM HAuCl4 prepared in 0.1 M NaCl."
  • Claim that the counterions are trapped; not in agreement with the SDS study above.
  • "Conditions for the consistent production of conducting polymer films employing potentiostatic deposition at 1.3 V for 60-90 s have been determined. The optimal concentration of the monomer is 0.0125 M, and that of the counterion is 0.05 M. "

PMID-24576579 Improving the performance of poly(3,4-ethylenedioxythiophene) for brain–machine interface applications

  • Show that TFB (BF4-) is a suitable counterion for EDOT electropolymerization.
  • Comparison is between PEDOT:TFB deposited in an anhydrous acetonitrile solution, and PEDOT:PSS deposited in an aqueous solution.
    • Presumably the PSS brings the EDOT into solution (??).
  • figure 3 is compelling, but long-term, electrodes are not that much better than Au!
    • Maybe we should just plate with that.

PEDOT-modified integrated microelectrodes for the detection of ascorbic acid, dopamine and uric acid

  • Direct comparison of acetonitrile and water solvents for electropolymerization of EDOT.
  • "PEDOT adhesion is best on gold surfaces due to the strong interactions between gold and sulphur atoms."
  • images/1353_2.pdf
    • Au plating is essential!

hide / / print
ref: -0 tags: David Kleinfeld penetrating arterioles perfusion cortex vasculature date: 10-17-2016 23:24 gmt revision:1 [0] [head]

PMID-17190804 Penetrating arterioles are a bottleneck in the perfusion of neocortex.

  • Focal photothrombosis was used to occlude single penetrating arterioles in rat parietal cortex, and the resultant changes in flow of red blood cells were measured with two-photon laser-scanning microscopy in individual subsurface microvessels that surround the occlusion.
  • We observed that the average flow of red blood cells nearly stalls adjacent to the occlusion and remains within 30% of its baseline value in vessels as far as 10 branch points downstream from the occlusion.
  • Preservation of average flow emerges 350 μm away; this length scale is consistent with the spatial distribution of penetrating arterioles.
  • Rose bengal photosensitizer.
  • 2p laser scanning microscopy.
  • Downstream and connected arterioles show a dramatic reduction in blood flow, even 1-4 branches in; there is little redundancy (figure 2)
  • Measured a good number of vessels (and look at their density!); results are satisfactorily quantitative.
  • Vessel leakiness extends up to 1.1mm away (!) (figure 5).

hide / / print
ref: -0 tags: gold micrograin recording electrodes electroplating impedance date: 10-17-2016 20:28 gmt revision:5 [4] [3] [2] [1] [0] [head]

PMID-23071004 Gold nanograin microelectrodes for neuroelectronic interfaces.

  • We report a single-cell sized microelectrode, which has unique gold nanograin structures, using a simple electrochemical deposition method.
  • Fabricated microelectrode had a sunflower shape with 1-5 μm micropetals along the circumference of the microelectrode and 500 nm nanograins at the center.
  • The nanograin electrodes had 69-fold decrease of impedance and 10-fold increase in electrical stimulation capability compared to unmodified flat gold microelectrodes.
  • images/1270_1.pdf pdf
  • The deposition was conducted with an aqueous solution containing 25 mM HAuCl4 (HAuCl4 · 3H2O, Sigma-Aldrich, MO, USA) and 20 g/L polyvinylpyrrolidone (surfactant, stabilizing agent)

hide / / print
ref: -0 tags: Charles Lieber syringe-injectable electronics SU-8 chronic flexible date: 10-14-2016 23:30 gmt revision:1 [0] [head]

PMID-27571550 Stable long-term chronic brain mapping at the single-neuron level.

  • Fu TM, Hong G, Zhou T, Schuhmann TG, Viveros RD, Lieber CM.
  • 8 months with only 800nm of SU-8 (400nm of insulation!!). This is both surprising and very impressive; we have to step up our game!
  • In a mouse, too - their surgical technique must be very good. Mice only live ~ 2 years anyway.
  • Figure 3 -- stability -- incredible.
  • Recording sites were bare platinum, 20um diameter; stimulation sites were also bare Pt, 150um dia.
    • No plating or microwire-FETs, so far as I can see; electrode impedances were stable at 200-600 kΩ (supplementary figure 12).

hide / / print
ref: -0 tags: Ciske Kalaska date: 10-11-2016 17:11 gmt revision:0 [head]

PMID-20345247 Neural mechanisms for interacting with a world full of choices

  • Cisek P, Kalaska JF.
  • "Affordance competition hypothesis" -- idea is that there is no specialization of areas per se, but rather a distribution of behavior-related areas, ones which specialize in task and stimulus/motor relevance, rather than processing and representation.
    • Similar, but distinct, from Minsky's Society of Mind.
    • Broadly supported by experimental evidence, which shows little 'conventional' representation, but plenty of sensory and motor representation, nearly everywhere..
  • Based on Jean Piaget (1954), "who suggested that the abstract cognitive abilities of adult humans are constructed upon the basis of the sensorimotor interactions experienced as a child."
    • "This is supported by a variety of neural studies, which include the classic experiments of Held & Hein (1963), who found that the visual behavior of newborn kittens did not develop properly unless they were allowed to exert their own active control upon their visual input."
  • Good ideas, good reformulation, but needs to be fleshed out a bit more; need to review the citing literature.

hide / / print
ref: -0 tags: ultrasonic BMI monkey LFP intan nordic Ozturk UCSD date: 09-30-2016 19:38 gmt revision:2 [1] [0] [head]

A Wireless 32-Channel Implantable Bidirectional Brain Machine Interface

  • Yi Su, Sudhamayee Routhu, Kee S. Moon, Sung Q. Lee, WooSub Youm and Yusuf Ozturk.
  • Only LFP from a Utah array, but solid work nonetheless.
  • 20V unipolar stimulation.
    • Through separate recording and stimulation electrodes.
  • 35mm x 10mm.
  • LFP due to limited bandwidth.
    • Less RF bandwidth & compression than the wireless system I designed 6 years ago.
    • Reason: "Further, in order to analyze the integrative synaptic processes, LFP is the signal of interest instead of spikes, because synaptic processes cannot be captured by spike activity of a small number of neurons."
  • Reference use of DuraGen followed by silicone elastomer.
  • Didn't cite us.

hide / / print
ref: -0 tags: bone regrowth hyperelastic 3d print implant hydroxyapatite polycaptolactone date: 09-30-2016 18:27 gmt revision:0 [head]

Hyperelastic “bone”: A highly versatile, growth factor–free, osteoregenerative, scalable, and surgically friendly biomaterial

  • (From the abstract): hyperelastic “bone” is composed of 90 weight % (wt %) hydroxyapatite and 10 wt % polycaprolactone or poly(lactic-co-glycolic acid),
  • Can be rapidly three-dimensionally (3D) printed (up to 275 cm3/hour) from room temperature extruded liquid inks.
  • Mechanical properties: ~32 to 67% strain to failure, ~4 to 11 MPa elastic modulus & was highly absorbent (50% material porosity)
  • Supported cell viability and proliferation, and induced osteogenic differentiation of bone marrow–derived human mesenchymal stem cells cultured in vitro over 4 weeks without any osteo-inducing factors in the medium.
  • HB did not elicit a negative immune response, became vascularized, quickly integrated with surrounding tissues, and rapidly ossified and supported new bone growth without the need for added biological factors.

hide / / print
ref: -0 tags: David Kleinfeld cortical vasculature laser surgery network occlusion flow date: 09-23-2016 06:35 gmt revision:1 [0] [head]

Heller Lecture - Prof. David Kleinfeld

  • Also mentions the use of LIBS + q-switched laser for precisely drilling holes in the skull. Seems to work!
    • Use 20ns delay .. seems like there is still spectral broadening.
    • "Turn neuroscience into an industrial process, not an art form" After doing many surgeries, agreed!
  • Vasodilation & vasoconstriction are very highly regulated; there is not enough blood to go around.
    • Vessels distant from an energetic / stimulated site will (net) constrict.
  • Vascular network is almost entirely closed-loop, and not tree-like at all -- you can occlude one artery, or one capillary, and the network will route around the occlusion.
    • The density of the angio-architecture in the brain is unique in this.
  • Tested micro-occlusions by injecting rose bengal, which releases free radicals on light exposure (532nm, 0.5mW), causing coagulation.
  • "Blood flow on the surface arteriole network is insensitive to single occlusions"
  • Penetrating arterioles and venules are largely stubs -- single unbranching vessels, which again renders some immunity to blockage.
  • However! Occlusion of a penetrating arteriole retards flow within a 400 - 600um cylinder (larger than a cortical column!)
  • Occlusion of many penetrating vessels, unsurprisingly, leads to large swaths of dead cortex, "UBOs" in MRI parlance (unidentified bright objects).
  • Death and depolarizing depression can be effectively prevented by excitotoxicity inhibitors -- MK801 in the slides (NMDA blocker, systemically)

hide / / print
ref: -0 tags: laser induced breakdown spectroscopy for surgery tissue differentiation date: 09-22-2016 19:26 gmt revision:0 [head]

PMID-25426327 Laser induced breakdown spectroscopy for bone and cartilage differentiation - ex vivo study as a prospect for a laser surgery feedback mechanism.

  • Mehari F, Rohde M, Knipfer C, Kanawade R, Klämpfl F, Adler W, Stelzle F, Schmidt M.
  • Tested on pig ear cartilage & cortical bone.
  • 532nm, Q-switched, flashlamp-pumped Nd:YAG, 80mJ pulse energy, 10ns, 1Hz.
  • Commercial spectrometer; light collected with 50um fiber optic connector.
    • We could probably put this in line with the laser mirrors.
  • Super clean results: see any of the figures.
    • AUC = 1.00 !!

hide / / print
ref: -0 tags: super resolution imaging PALM STORM fluorescence date: 09-21-2016 05:57 gmt revision:0 [head]

PMID-23900251 Parallel super-resolution imaging

  • Christopher J Rowlands, Elijah Y S Yew, and Peter T C So
  • Though this is a brief Nature intro article, I found it to be more usefully clear than the wikipedia articles on super-resolution techniques.
  • STORM and PALM seek to stochastically switch fluorophores between emission and dark states, and are parallel but stochastic; STED and RESOLFT use high-intensity donut beams to stimulate emission (STED) or photobleach (RESOLFT) fluorophores outside of an arbitrarily-small location.
    • All need Gaussian fitting to estimate emitter location from the point-spread function.
  • This article comments on a clever way of making 1e5 donuts for parallel (as opposed to rastered) STED / RESOLFT.
  • I doubt setting up a STED microscope is at all easy; to get these resolutions, everything must be still to a few nm!

hide / / print
ref: -0 tags: nucleus accumbens caudate stimulation learning enhancement MIT date: 09-20-2016 23:51 gmt revision:1 [0] [head]

Temporally Coordinated Deep Brain Stimulation in the Dorsal and Ventral Striatum Synergistically Enhances Associative Learning

  • Monkeys had to learn to associate an image with one of 4 reward targets.
    • Fixation period, movement period, reward period -- more or less standard task.
    • Blocked trial structure with randomized associations + control novel images + control familiar images.
  • Timed stimulation:
    • Nucleus Accumbens during fixation period
      • Shell not core; non-hedonic in separate test.
    • Caudate (which part -- targeting?) during feedback on correct trials.
  • Performance on stimulated images improved in reaction time, learning rate, and ultimate % correct.
  • Small non-significant improvement in non-stimulated novel image.
  • Wonder how many stim protocols they had to try to get this correct?

hide / / print
ref: -0 tags: planned economy red plenty date: 08-08-2016 05:54 gmt revision:0 [head]


  • Quote: "That planning is not a viable alternative to capitalism (as opposed to a tool within it) should disturb even capitalism’s most ardent partisans. It means that their system faces no competition, nor even any plausible threat of competition."
    • And therefore not only cannot be improved, but must degrade with time. But see below.
  • Quote: What we can do is try to find the specific ways in which these powers we have conjured up are hurting us, and use them to check each other, or deflect them into better paths. Sometimes this will mean more use of market mechanisms, sometimes it will mean removing some goods and services from market allocation, either through public provision or through other institutional arrangements. Sometimes it will mean expanding the scope of democratic decision-making (for instance, into the insides of firms), and sometimes it will mean narrowing its scope (for instance, not allowing the demos to censor speech it finds objectionable). Sometimes it will mean leaving some tasks to experts, deferring to the internal norms of their professions, and sometimes it will mean recognizing claims of expertise to be mere assertions of authority, to be resisted or countered.
    • I like to think of this as a very unstable equilibrium: the only way to maintain function is to continuously expend energy to shore up and change the market, politics, and society in general; the specific regulatory solution has complexity commensurate with the complexity of the economy regulated, and it must adapt on the same scales that the market economy changes.
    • Perhaps to do this, it needs a self-reflective faculty, to know which parts of itself need changing; otherwise, you'd need to have a regulator regulating the regulator, and who is to prevent that from agglomerating power. Yet this too is an unstable equilibrium.

hide / / print
ref: -0 tags: NC state tap drill chart date: 08-02-2016 18:38 gmt revision:0 [head]


by way of: https://m.reddit.com/r/engineering/comments/4ry07t/does_anyone_have_a_stored_copy_of_this_tap_and/

hide / / print
ref: -0 tags: image registration optimization camera calibration sewing machine date: 07-15-2016 05:04 gmt revision:20 [19] [18] [17] [16] [15] [14] [head]

Recently I was tasked with converting from image coordinates to real world coordinates from stereoscopic cameras mounted to the end-effector of a robot. The end goal was to let the user (me!) click on points in the image, and have the robot record that position & ultimately move to it.

The overall strategy is to get a set of points in both image and RW coordinates, then fit some sort of model to the measured data. I began by printing out a grid of (hopefully evenly-spaced and perpendicular) lines via a laser printer; spacing was ~1.1 mm. This grid was manually aligned to the axes of robot motion by moving the robot along one axis & checking that the lines did not jog.

The images were modeled as a grating with quadratic phase in $u,v$ texture coordinates:

$p_h(u,v) = \sin((a_h u/1000 + b_h v/1000 + c_h)v + d_h u + e_h v + f_h) + 0.97$   (1)

$p_v(u,v) = \sin((a_v u/1000 + b_v v/1000 + c_v)u + d_v u + e_v v + f_v) + 0.97$   (2)

$I(u,v) = 16 p_h p_v / \sqrt{2 + 16 p_h^2 + 16 p_v^2}$   (3)

The 1000 was used to make the parameter search distribution more spherical; $c_h, c_v$ were bias terms to seed the solver; 0.97 was a duty-cycle term fit by inspection to the image data; (3) is a modified sigmoid.

$I$ was then optimized over the parameters using a GPU-accelerated (CUDA) nonlinear stochastic optimization:

$(a_h, b_h, d_h, e_h, f_h \mid a_v, b_v, d_v, e_v, f_v) = \operatorname{argmin} \sum_u \sum_v (I(u,v) - Img(u,v))^2$   (4)

Optimization was carried out by drawing parameters from a normal distribution with a diagonal covariance matrix, set by inspection, and mean iteratively set to the best solution; horizontal and vertical optimization steps were separable and carried out independently. The equation (4) was sampled 18k times, and equation (3) 34 billion times per frame. Hence the need for GPU acceleration.
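The model and the improvement-only random search can be sketched in a few lines of pure Python (the original ran on CUDA; the parameter values, grid size, and iteration counts below are made up for illustration):

```python
import math
import random

# Auxiliary gratings with quadratic phase, eqs. (1) and (2).
# Each parameter tuple is (a, b, c, d, e, f); values used are hypothetical.
def p_h(u, v, a, b, c, d, e, f):
    return math.sin((a*u/1000 + b*v/1000 + c)*v + d*u + e*v + f) + 0.97

def p_v(u, v, a, b, c, d, e, f):
    return math.sin((a*u/1000 + b*v/1000 + c)*u + d*u + e*v + f) + 0.97

def intensity(u, v, ph, pv):
    # eq. (3): modified sigmoid combining the two gratings
    h, w = p_h(u, v, *ph), p_v(u, v, *pv)
    return 16*h*w / math.sqrt(2 + 16*h*h + 16*w*w)

def loss(params, img):
    # eq. (4): sum-squared error against the measured image (dict of samples)
    ph, pv = params
    return sum((intensity(u, v, ph, pv) - img[(u, v)])**2 for (u, v) in img)

def random_search(img, seed_params, sigma, iters=100, rng=None):
    # draw candidates from a diagonal Gaussian around the current best,
    # keeping only improvements (mean iteratively set to the best solution)
    rng = rng or random.Random(0)
    best, best_loss = seed_params, loss(seed_params, img)
    for _ in range(iters):
        cand = tuple(tuple(x + rng.gauss(0, s) for x, s in zip(pp, ss))
                     for pp, ss in zip(best, sigma))
        cand_loss = loss(cand, img)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
    return best, best_loss
```

Setting a per-parameter sigma of zero pins the bias terms $c_h, c_v$, matching the text; on the real system this inner loop is what needed the GPU.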

This yielded a set of 10 parameters (again, $c_h$ and $c_v$ were bias terms and kept constant) which modeled the data (e.g. grid lines) for each of the two cameras. This process was repeated every 0.1 mm from 0 - 20 mm height (z) from the target grid, resulting in a sampled function for each of the parameters, e.g. $a_h(z)$. This required 13 trillion evaluations of equation (3).

Now, the task was to use this model to generate the forward and reverse transform from image to world coordinates; I approached this by generating a data set of the grid intersections in both image and world coordinates. To start this process, the known image origin $u_{origin}|_{z=0}, v_{origin}|_{z=0}$ was used to find the corresponding roots of the periodic auxiliary functions $p_h, p_v$:

$\frac{3\pi}{2} + 2\pi n_h = a_h u v/1000 + b_h v^2/1000 + (c_h + e_h)v + d_h u + f_h$   (5)

$\frac{3\pi}{2} + 2\pi n_v = a_v u^2/1000 + b_v u v/1000 + (c_v + d_v)u + e_v v + f_v$   (6)

Or ..

$n_h = round((a_h u v/1000 + b_h v^2/1000 + (c_h + e_h)v + d_h u + f_h - \frac{3\pi}{2}) / (2\pi))$   (7)

$n_v = round((a_v u^2/1000 + b_v u v/1000 + (c_v + d_v)u + e_v v + f_v - \frac{3\pi}{2}) / (2\pi))$   (8)

From this, we get variables $n_{h,origin}|_{z=0}$ and $n_{v,origin}|_{z=0}$, which are the offsets to align the sine functions $p_h, p_v$ with the physical origin. Now, the reverse (world to image) transform was needed, for which a two-stage Newton scheme was used to solve equations (7) and (8) for $u,v$. Note that this is an equation of phase, not image intensity -- otherwise this direct method would not work!
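Equations (7)-(8) just count how many full $2\pi$ cycles of the auxiliary phase separate a point from the $3\pi/2$ root. A minimal sketch of the horizontal case (parameter values are hypothetical):

```python
import math

# Horizontal-grating phase: the right-hand side of eq. (5).
# Parameter tuple is (a_h, b_h, c_h, d_h, e_h, f_h); values used are made up.
def phase_h(u, v, a, b, c, d, e, f):
    return a*u*v/1000 + b*v*v/1000 + (c + e)*v + d*u + f

def n_h(u, v, params):
    # eq. (7): nearest integer count of 2*pi cycles past the 3*pi/2 root
    return round((phase_h(u, v, *params) - 1.5*math.pi) / (2*math.pi))
```

The rounding snaps to the nearest grid line, so it is only meaningful once $(u,v)$ is within half a grating period of the intended intersection.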

First, the equations were linearized with three steps of (9-11) to get in the right ballpark:

$u_0 = 640, v_0 = 360$

$n_h = n_{h,origin}|_z + [-30 .. 30], \quad n_v = n_{v,origin}|_z + [-20 .. 20]$   (9)

$B_i = \begin{bmatrix} \frac{3\pi}{2} + 2\pi n_h - a_h u_i v_i/1000 - b_h v_i^2/1000 - f_h \\ \frac{3\pi}{2} + 2\pi n_v - a_v u_i^2/1000 - b_v u_i v_i/1000 - f_v \end{bmatrix}$   (10)

$A_i = \begin{bmatrix} d_h & c_h + e_h \\ c_v + d_v & e_v \end{bmatrix}$ and

$\begin{bmatrix} u_{i+1} \\ v_{i+1} \end{bmatrix} = mldivide(A_i, B_i)$   (11) where mldivide is the Matlab left-division operator.

Then three steps with the full Jacobian were made to attain accuracy:

$J_i = \begin{bmatrix} a_h v_i/1000 + d_h & a_h u_i/1000 + 2 b_h v_i/1000 + c_h + e_h \\ 2 a_v u_i/1000 + b_v v_i/1000 + c_v + d_v & b_v u_i/1000 + e_v \end{bmatrix}$   (12)

$K_i = \begin{bmatrix} a_h u_i v_i/1000 + b_h v_i^2/1000 + (c_h + e_h)v_i + d_h u_i + f_h - \frac{3\pi}{2} - 2\pi n_h \\ a_v u_i^2/1000 + b_v u_i v_i/1000 + (c_v + d_v)u_i + e_v v_i + f_v - \frac{3\pi}{2} - 2\pi n_v \end{bmatrix}$   (13)

$\begin{bmatrix} u_{i+1} \\ v_{i+1} \end{bmatrix} = \begin{bmatrix} u_i \\ v_i \end{bmatrix} - J_i^{-1} K_i$   (14)
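Since the system is 2x2, the full-Jacobian Newton steps of (12)-(14) reduce to a closed-form solve. A pure-Python sketch (the parameter values exercised here are hypothetical, and a few more iterations than the three in the text are used for slack):

```python
import math

# Phase residuals of eq. (13): zero when (u, v) sits on the grid
# intersection indexed by (n_h, n_v). Parameter tuples follow the text's
# (a, b, c, d, e, f) naming; values used in examples are made up.
def residuals(u, v, ph, pv, nh, nv):
    ah, bh, ch, dh, eh, fh = ph
    av, bv, cv, dv, ev, fv = pv
    kh = (ah*u*v/1000 + bh*v*v/1000 + (ch + eh)*v + dh*u + fh
          - 1.5*math.pi - 2*math.pi*nh)
    kv = (av*u*u/1000 + bv*u*v/1000 + (cv + dv)*u + ev*v + fv
          - 1.5*math.pi - 2*math.pi*nv)
    return kh, kv

def newton_uv(ph, pv, nh, nv, u, v, iters=8):
    ah, bh, ch, dh, eh, fh = ph
    av, bv, cv, dv, ev, fv = pv
    for _ in range(iters):
        kh, kv = residuals(u, v, ph, pv, nh, nv)
        # full Jacobian, eq. (12)
        j11 = ah*v/1000 + dh
        j12 = ah*u/1000 + 2*bh*v/1000 + ch + eh
        j21 = 2*av*u/1000 + bv*v/1000 + cv + dv
        j22 = bv*u/1000 + ev
        # eq. (14): subtract J^-1 K, with the 2x2 inverse written out
        det = j11*j22 - j12*j21
        u -= (j22*kh - j12*kv) / det
        v -= (j11*kv - j21*kh) / det
    return u, v
```

The consistency check from the text follows naturally: plug the solution back into (7)-(8) and confirm the rounded indices are unchanged.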

Solutions $(u,v)$ were verified by plugging back into equations (7) and (8) & verifying $n_h, n_v$ were the same. Inconsistent solutions were discarded; solutions outside the image space $[0, 1280), [0, 720)$ were also discarded. The process (10) - (14) was repeated to tile the image space with grid intersections, as indicated in (9), and this was repeated for all $z$ in $(0 .. 0.1 .. 20)$, resulting in a large (74k points) dataset of $(u, v, n_h, n_v, z)$, which was converted to full real-world coordinates based on the measured spacing of the grid lines, $(u, v, x, y, z)$. Between individual z steps, $n_{h,origin}, n_{v,origin}$ were re-estimated to minimize (for a current $z'$):

$(u_{origin}|_{z'+0.1} - u_{origin}|_{z'})^2 + (v_{origin}|_{z'+0.1} - v_{origin}|_{z'})^2$   (15)

with grid search, and the method of equations (9-14). This was required as the stochastic method used to find the original image model parameters was agnostic to phase, and so phase (via the parameters $f_h, f_v$) could jump between individual $z$ measurements (the origin did not move much between successive measurements, hence (15) fixed the jumps.)

To this dataset, a model was fit:

[u v]=A[1 x y z x 2 y 2 z 2 w 2 xy xz yz xw yw zw] {\left[ \begin{matrix} u \\ v \end{matrix} \right]} = A {\left[ \begin{matrix} 1 && x && y && z && x'^2 && y'^2 && \prime z'^2 && w^2 && x' y' && x' z' && y' z' && x' w && y' w && z' w \end{matrix} \right]} (16)

Where x=x10x' = \frac{x}{ 10} , y=y10y' = \frac{y}{ 10} , z=z10z' = \frac{z}{ 10} , and w=2020zw = \frac{ 20}{20 - z} . ww was introduced as an axillary variable to assist in perspective mapping, ala computer graphics. Likewise, x,y,zx,y,z were scaled so the quadratic nonlinearity better matched the data.

The model (16) was fit using ordinary linear regression over all rows of the validated dataset. This resulted in a second set of coefficients A for a model mapping world coordinates to image coordinates; again, the model was inverted using Newton's method (Jacobian omitted here!). These coefficients, one set per camera, were then integrated into the C++ program for displaying video, and the inverse mapping (using closed-form matrix inversion) was used to convert mouse clicks to real-world coordinates for robot motor control. Even with the relatively poor wide-FOV cameras employed, the method is accurate to ±50 µm, and precise to ±120 µm.
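A minimal sketch (mine, not the original code) of the fit-and-invert scheme: the feature vector of equation (16), an ordinary least-squares fit of A, and Newton inversion of (x, y) at fixed z with a numerical Jacobian. The synthetic coefficients and starting point in any usage are made up.

```python
import numpy as np

def features(x, y, z):
    # Feature vector of eq. (16): x', y', z' are the scaled coordinates and
    # w = 20/(20 - z) is the perspective auxiliary variable (valid for z < 20).
    xp, yp, zp = x / 10.0, y / 10.0, z / 10.0
    w = 20.0 / (20.0 - z)
    return np.array([1.0, x, y, z, xp*xp, yp*yp, zp*zp, w*w,
                     xp*yp, xp*zp, yp*zp, xp*w, yp*w, zp*w])

def fit_model(pts):
    # pts: rows of (u, v, x, y, z); ordinary least squares for A (2 x 14).
    F = np.stack([features(x, y, z) for (_, _, x, y, z) in pts])  # N x 14
    UV = np.array([[u, v] for (u, v, _, _, _) in pts])            # N x 2
    coef, *_ = np.linalg.lstsq(F, UV, rcond=None)
    return coef.T                                                 # 2 x 14

def invert(A, u, v, z, xy0=(0.0, 0.0), iters=20):
    # Newton's method on (x, y) at fixed z, numerical Jacobian, as in (14).
    x, y = xy0
    target = np.array([u, v])
    eps = 1e-6
    for _ in range(iters):
        K = A @ features(x, y, z) - target   # residual
        Jx = (A @ features(x + eps, y, z) - A @ features(x - eps, y, z)) / (2 * eps)
        Jy = (A @ features(x, y + eps, z) - A @ features(x, y - eps, z)) / (2 * eps)
        x, y = np.array([x, y]) - np.linalg.solve(np.column_stack([Jx, Jy]), K)
    return x, y
```

On noiseless synthetic data the least-squares fit recovers the generating coefficients exactly, and Newton converges in a few iterations since the map is only mildly nonlinear in (x, y).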

ref: Gradinaru-2009.04 tags: Deisseroth DBS STN optical stimulation 6-OHDA optogenetics date: 05-10-2016 23:48 gmt revision:8 [7] [6] [5] [4] [3] [2] [head]

PMID-19299587[0] Optical Deconstruction of Parkinsonian Neural Circuitry.

  • Viviana Gradinaru, Murtaza Mogri, Kimberly R. Thompson, Jaimie M. Henderson, Karl Deisseroth
  • DA depletion of the SN leads to abnormal activity in the BG; HFS (>90Hz) of the STN has been found to be therapeutic, but the mechanism is imperfectly understood.
    • lesions of the BG can also be therapeutic.
  • Used channelrhodopsin (a light-activated cation channel), expressed via cell-type-specific promoters (transgenic animals). Also used halorhodopsins, which are light-activated chloride pumps (inhibition).
    • optogenetics allows simultaneous optical stimulation and electrical recording without artifact.
  • Made PD rats by 6-hydroxydopamine unilaterally into the medial forebrain bundle of rats.
  • Then they injected eNpHr (inhibitory) opsin vector targeting excitatory neurons (under control of the CaMKIIa receptor) to the STN as identified stereotaxically & by firing pattern.
    • Electrical stimulation of this area alleviated rotational behavior (they were hemiparkinsonian rats), but optical inhibition of the STN did not.
  • Alternately, the glia in STN may be secreting molecules that modulate local circuit activity; it has been shown that glial-derived factor adenosine accumulates during DBS & seems to help with attenuation of tremor.
    • Tested this by activating glia with ChR2, which can pass small Ca+2 currents.
    • This worked: blue light halted firing in the STN; but, again, no behavioral trace of the silencing was found.
  • PD is characterized by pathological levels of beta oscillations in the BG; synchronizing the STN with the BG at gamma frequencies may ameliorate PD symptoms, while synchronization at beta frequencies worsens them -- see [1][2]
  • Therefore, they tried excitatory optical stimulation of excitatory STN neurons at the high frequencies used in DBS (90-130Hz).
    • HFS to STN failed, again, to produce any therapeutic effect!
  • Next expressed channelrhodopsin in only projection neurons, Thy1::ChR2 (not excitatory cells in STN), and again did optotrode (optical stimulation, electrical recording) recordings.
    • HFS of afferent fibers to STN shut down most of the local circuitry there, with some residual low-amplitude high frequency burstiness.
    • Observed marked effects with this treatment! Afferent HFS alleviated Parkinsonian symptoms, profoundly, with immediate reversal once the laser was turned off.
    • LFS worsened PD symptoms, in accord with electrical stimulation.
    • The Thy1::ChR2 only affected excitatory projections; GABAergic projections from GPe were absent. Dopamine projections from the SNc were not affected by the virus either. However, M1 layer V projection neurons were strongly labeled by the retrovirus.
      • M1 layer V neurons could be antidromically recruited by optical stimulation in the STN.
  • Selective M1 layer V HFS also alleviated PD symptoms; LFS had no effect; M2 (PMd/PMv?) LFS causes motor behavior.
  • Remind us that DBS can treat tremor, rigidity, and bradykinesia, but is ineffective at treating speech impairment, depression, and dementia.
  • Suggest that axon tract modulation could be a common theme in DBS (all the different types..), as activity in white matter represents the activity of larger regions compactly.
  • The result that the excitatory fibers of projections, mainly from the motor cortex, matter most in producing therapeutic effects of DBS is counterintuitive but important.
    • What do these neurons do normally, anyway? give a 'copy' of an action plan to the STN? What is their role in M1 / the BG? They should test with normal mice.


[0] Gradinaru V, Mogri M, Thompson KR, Henderson JM, Deisseroth K, Optical Deconstruction of Parkinsonian Neural Circuitry. Science (2009 Mar 19)
[1] Eusebio A, Brown P, Synchronisation in the beta frequency-band - The bad boy of parkinsonism or an innocent bystander? Exp Neurol (2009 Feb 20)
[2] Wingeier B, Tcheng T, Koop MM, Hill BC, Heit G, Bronte-Stewart HM, Intra-operative STN DBS attenuates the prominent beta rhythm in the STN in Parkinson's disease.Exp Neurol 197:1, 244-51 (2006 Jan)

ref: -2016 tags: 6-OHDA parkinsons model warren grill simulation date: 05-10-2016 23:30 gmt revision:4 [3] [2] [1] [0] [head]

PMID-26867734 A biophysical model of the cortex-basal ganglia-thalamus network in the 6-OHDA lesioned rat model of Parkinson’s disease

  • Kumaravelu K1, Brocker DT1, Grill WM
  • Background: Although animal models (6-OHDA rats, MPTP mk) are rendered parkinsonian by a common mechanism (loss of dopaminergic neurons), there is considerable variation in the neuronal activity underlying the pathophysiology, including differences in firing rates, firing patterns, responses to cortical stimulation, and neuronal synchronization across different basal ganglia (BG) structures (Kita and Kita 2011;Nambu et al. 2000).
    • Yep. Highly idiopathic disease.
    • Claim there are good models of the MPTP monkey:
      • PMID-20309620 Modeling shifts in the rate and pattern of subthalamopallidal network activity during deep brain stimulation.
      • PMID-22805068 Network effects of subthalamic deep brain stimulation drive a unique mixture of responses in basal ganglia output.
  • Biophysical model of the cortex - basal ganglia - thalamus circuit
    • Hodgkin-Huxley type.
      • Single compartment neurons.
    • Validated by comparing responses of the BG to CTX stimulation.
    • Details, should they be important:
      • Each rCortex (regularly spiking) neuron
        • excitatory input from one TH neuron
        • inhibitory input from four randomly selected iCortex neurons.
        • Izhikevich model.
      • Each iCortex (fast inhibitory) neuron
        • excitatory input from four randomly selected rCortex neurons.
      • Each dStr (direct, D1/D5, ex) neuron
        • excitatory input from one rCortex neuron
        • inhibitory axonal collaterals from three randomly selected dStr neurons.
      • Each idStr (indirect, D2, inhib) neuron
        • excitatory input from one rCortex neuron
        • inhibitory axonal collaterals from four randomly selected idStr neurons.
      • Each STN neuron
        • inhibitory input from two GPe neurons
        • excitatory input from two rCortex neurons.
        • DBS modeled as a somatic current.
      • Each GPe neuron
        • inhibitory axonal collaterals from any two other GPe neurons
        • inhibitory input from all idStr neurons.
      • Each GPi neuron
        • inhibitory input from two GPe neurons
        • inhibitory input from all dStr neurons.
      • Some GPe/GPi neurons receive
        • excitatory input from two STN neurons,
        • while others do not.
      • Each TH neuron receives inhibitory input from one GPi neuron.
  • Diseased state:
    • Loss of striatal dopamine is accompanied by an increase in acetylcholine levels (Ach) in the Str (Ikarashi et al. 1997)
      • This results in a reduction of M-type potassium current in both the direct and indirect MSNs. (2.6 -> 1.5)
    • Dopamine loss results in reduced sensitivity of direct Str MSN to cortical stimulation (Mallet et al. 2006)
      • corticostriatal synaptic conductance from 0.07 to 0.026
    • Striatal dopamine depletion causes an increase in the synaptic strength of intra-GPe axonal collaterals resulting in aberrant GPe firing (Miguelez et al. 2012)
      • Increase from 0.125 to 0.5.
  • Good match to experimental rats:
  • Ok, so this is a complicated model (they aim to be the most complete to-date). How sensitive is it to parameter perturbations?
    • Noticeable ~20 Hz oscillations in BG in PD condition
    • ~9 Hz in STN & GPi.
  • And how well do the firing rates match experiment?
    • Not very. Look at the error bars.
  • What does DBS (direct current injection into STN neurons) do?
    • See panels d,e,f: stochastic parameters; g,h,i: (semi) stochastic wiring.
  • Another check: NMDA antagonist into STN suppressed STN beta band oscillations in 6-OHDA lesioned rats (Pan et al. 2014).
    • Analysis of model GPi neurons revealed that episodes of beta band oscillatory activity interrupted alpha oscillatory activity in the PD state (Fig. 9a, b), consistent with experimental evidence that episodes of tremor-related oscillations desynchronized beta activity in PD patients (Levy et al. 2002).
  • What does DBS, at variable frequencies, do oscillations in the circuit?
  • How might this underlie a mechanism of action?
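The per-neuron wiring rules listed above can be captured as a small connectivity constructor. A sketch, where the population size, the RNG, and the "some GPe/GPi neurons" split are my placeholders, not the paper's values:

```python
import random

def wire_network(n=10, seed=0):
    # Adjacency lists keyed by (source_pop, target_pop); each entry is a list
    # of (src_index, dst_index) pairs. In-degrees follow the rules above.
    rng = random.Random(seed)
    pick = lambda k: rng.sample(range(n), k)
    c = {}
    c[('TH', 'rCtx')]     = [(i, i) for i in range(n)]                  # 1 TH each
    c[('iCtx', 'rCtx')]   = [(s, i) for i in range(n) for s in pick(4)] # 4 random
    c[('rCtx', 'iCtx')]   = [(s, i) for i in range(n) for s in pick(4)]
    c[('rCtx', 'dStr')]   = [(i, i) for i in range(n)]
    c[('dStr', 'dStr')]   = [(s, i) for i in range(n) for s in pick(3)] # collaterals
    c[('rCtx', 'idStr')]  = [(i, i) for i in range(n)]
    c[('idStr', 'idStr')] = [(s, i) for i in range(n) for s in pick(4)]
    c[('GPe', 'STN')]     = [(s, i) for i in range(n) for s in pick(2)]
    c[('rCtx', 'STN')]    = [(s, i) for i in range(n) for s in pick(2)]
    c[('GPe', 'GPe')]     = [(s, i) for i in range(n)
                             for s in rng.sample([j for j in range(n) if j != i], 2)]
    c[('idStr', 'GPe')]   = [(s, i) for i in range(n) for s in range(n)]  # all idStr
    c[('GPe', 'GPi')]     = [(s, i) for i in range(n) for s in pick(2)]
    c[('dStr', 'GPi')]    = [(s, i) for i in range(n) for s in range(n)]  # all dStr
    # only some GPe/GPi neurons receive STN excitation (half, arbitrarily, here)
    c[('STN', 'GPe')]     = [(s, i) for i in range(0, n, 2) for s in pick(2)]
    c[('STN', 'GPi')]     = [(s, i) for i in range(0, n, 2) for s in pick(2)]
    c[('GPi', 'TH')]      = [(i, i) for i in range(n)]
    return c
```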

Overall, not a bad paper. Not very well organized, which is not assisted by the large amount of information presented, but having slogged through the figures, I'm somewhat convinced that the model is good. This despite my general reservations of these models: the true validation would be to have it generate actual behavior (and learning)!

Lacking this, the approximations employed seem like a step forward in understanding how PD and DBS work. The results and discussion are consistent with {1255}, but not {711}, which found that STN projections from M1 (not the modulation of M1 projections to GPi, via efferents from STN) truly matter.

ref: -2012 tags: Emo Todorov contact invariant animation optimization complex motor behavior date: 05-04-2016 17:34 gmt revision:3 [2] [1] [0] [head]

  • Watch the [http://homes.cs.washington.edu/~todorov/index.php?video=MordatchSIGGRAPH12&paper=Mordatch,%20SIGGRAPH%202012 movies! Discovery of complex behaviors through contact-invariant optimization]

  • Complex movements tend to have phases within which the set of active contacts (hands, feet) remains invariant (hence can exert forces on the objects they are contacting, or vice versa).
  • Discovering suitable contact sets is the central goal of optimization in our approach.
    • Once this is done, optimizing the remaining aspects of the movement tends to be relatively straightforward.
    • They do this through auxiliary scalar variables which indicate whether a contact is active, hence whether to enable contact forces.
      • Allows the optimizer to 'realize' that movements should have phases.
      • Also "shapes the energy landscape to be smoother and better behaved"
  • Initial attempts made these contact auxiliary variables discrete -- when and where contacts occur -- which was easy for humans to specify, but made optimization intractable.
    • Motion between contacts was modeled as a continuous feedback system.
  • Instead, the contact variables c_i have to be continuous.
  • Contact forces are active only when c_i is 'large'.
    • Hence all potential contacts have to be enumerated in advance.
  • Then, parameterize the end effector (position) and use inverse kinematics to figure out joint angles.
  • Optimization:
    • Break the movement up into a predefined number of phases, equal duration.
    • Interpolate end-effector with splines
    • Physics constraints are 'soft' -- helps the optimizer : 'powerful continuation methods'
      • That is, weight different terms differently in phases of the optimization process.
      • Likewise, appendages are allowed to stretch and intersect, with a smooth cost.
    • Contact-invariant cost penalizes distortion and slip (difference between endpoint and surface, measured normal, and velocity relative to contact point)
      • Contact point is also 'soft' and smooth via distance-normalized weighting.
    • All contact forces are merged into a vector f \in \mathbb{R}^6, which includes both forces and torques. Hence the contact force origin can move within the contact patch, which again makes the optimization smoother.
    • Set \tau(q, \dot{q}, \ddot{q}) = J(q)^T f + B u, where J(q) maps generalized velocities to contact-point (endpoint) velocities, so J(q)^T maps contact forces back to generalized forces; f is the contact-force vector above, and B maps control forces u to the full space.
    • \tau(q, \dot{q}, \ddot{q}) = M(q)\ddot{q} + C(q, \dot{q})\dot{q} + G(q) -- M is the inertia matrix, C the Coriolis matrix, G gravity.
      • This means forces must balance: friction f + control u = inertia + Coriolis + gravity.
    • Hence the need to optimize both f and u.
      • Use friction-cone approximation for non-grab (feet) contact forces.
    • These are optimized within a quadratic programming framework.
      • LBFGS algo.
      • Squared terms for friction and control, squared penalization for penetrating and slipping on a surface.
    • Phases of optimization (continuation method):
      • L(s) = L_{CI}(s) + L_{physics}(s) + L_{task}(s) + L_{hint}(s)
      • task term only: wishful thinking.
      • all 4 terms, physics lessened -- gradually add constraints.
      • all terms, no hint, full physics.
  • Total time to simulate 2-10 minutes per clip (only!)
  • The equations of the paper seem incomplete -- not clear how the QP objective fits in with L(s), and how c_i enters J(q)^T f + B u.
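A toy version of the continuation idea -- the same cost terms, re-weighted per phase, with each phase warm-started from the previous solution. Plain gradient descent with a numerical gradient stands in for the paper's LBFGS/QP machinery, and the weights are illustrative:

```python
def continuation_minimize(terms, schedule, s0, steps=2000, lr=0.01):
    # terms: list of cost functions L_k(s); schedule: per-phase weight tuples.
    # Each phase minimizes sum_k w_k * L_k(s) by gradient descent, starting
    # from the previous phase's solution.
    s = list(s0)
    eps = 1e-5
    for weights in schedule:
        def L(x):
            return sum(w * f(x) for w, f in zip(weights, terms))
        for _ in range(steps):
            g = []
            for i in range(len(s)):
                sp, sm = s[:], s[:]
                sp[i] += eps
                sm[i] -= eps
                g.append((L(sp) - L(sm)) / (2 * eps))   # numerical gradient
            s = [si - lr * gi for si, gi in zip(s, g)]
    return s
```

With a "task" term pulling one coordinate to a target and a "physics" term coupling the coordinates, the first (wishful-thinking) phase solves the task alone and the second phase reconciles it with the constraint.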

ref: -0 tags: ZeroMQ messaging sockets multithreading date: 05-03-2016 06:10 gmt revision:0 [head]

ZeroMQ -- much better sockets framework than native TCP/UDP sockets.

  • Bindings for Ocaml, too.
  • Supports Erlang-like concurrency.

ref: -0 tags: molecule mean free path vacuum date: 05-01-2016 03:16 gmt revision:0 [head]

Useful numbers for estimating molecular mean-free-path in vacuum systems:


Pressure (Pa)   Pressure (Torr)   Mean free path
0.01 Pa         7.5e-5 Torr       4.8 m
10 Pa           75 mTorr          4.8 mm
30 Pa           225 mTorr         1.6 mm
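For sanity-checking such numbers, the kinetic-theory estimate is λ = k_B·T / (√2·π·d²·p). The effective molecular diameter d is gas-dependent (3.7e-10 m is a common value for N2/air); the tabulated values imply a different effective diameter, but the 1/p scaling between rows is the same.

```python
import math

def mean_free_path_m(pressure_pa, temp_k=293.0, diam_m=3.7e-10):
    # lambda = k_B * T / (sqrt(2) * pi * d^2 * p), in meters.
    k_b = 1.380649e-23  # Boltzmann constant, J/K
    return k_b * temp_k / (math.sqrt(2) * math.pi * diam_m ** 2 * pressure_pa)
```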

ref: -0 tags: google glucose sensing contact lens date: 04-28-2016 19:41 gmt revision:2 [1] [0] [head]

A contact lens with embedded sensor for monitoring tear glucose level

  • PMID-21257302
  • Metal stack: Ti 10 nm / Pd 10 nm / Pt 100 nm.
  • on a 100um thick PET film.
  • A 30 µL glucose oxidase solution (10 mg/mL) was dropped onto the sensor area.
  • Then the sensor was suspended vertically above a titanium isopropoxide solution in a sealed dish for 6 h to create a GOD/titania sol-gel membrane, just as reported (Yu et al., 2003).
  • After forming the sol-gel membrane, a 30 µL aliquot of Nafion® solution was dropped onto the same area of the sensor, and allowed to dry in air for about 20 min.
  • Yet, the interference rejection from Nafion is imperfect; at 100uM concentrations, glucose is indistinguishable from ascorbic acid + lactate + urea.
  • Sensor drifts to 55% original performance after 4 days: figure 6
    • sensor was stored in a buffer @ 4C.
    • Probably OK for contact lenses, though.

ref: -0 tags: concentration of monoamine dopamine serotonin and norepinephrine in the brain date: 04-28-2016 19:38 gmt revision:3 [2] [1] [0] [head]

What are the concentrations of the monoamines in the brain? (Purpose: estimate the required electrochemical sensing area & efficiency)

  • Dopamine: 100 uM - 1 mM local, extracellular.
    • PMID-17709119 The Yin and Yang of dopamine release: a new perspective.
  • Serotonin (5-HT): 100 ng/g, 0.5 uM, whole brain (not extracellular!).
  • Norepinephrine / noradrenaline: 400 ng/g, 2.4 uM, again whole brain.
    • PMID-11744005 An enriched environment increases noradrenaline concentration in the mouse brain.
    • Also has whole-brain extracts for DA and 5HT, roughly:
      • 1200 ng/g DA
      • 400 ng/g NE
      • 350 ng/g 5-HT
  • So, one could imagine ~100 uM transient concentrations for all 3 monoamines.
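Quick conversion check: assuming a tissue density of ~1 g/mL (an approximation), the ng/g figures above reproduce the quoted µM values.

```python
def ng_per_g_to_uM(ng_per_g, mol_wt_g_per_mol, density_g_per_mL=1.0):
    # Convert whole-tissue ng/g to micromol/L, assuming ~1 g/mL tissue density.
    mol_per_g = ng_per_g * 1e-9 / mol_wt_g_per_mol
    mol_per_L = mol_per_g * density_g_per_mL * 1000.0  # 1 L = 1000 mL
    return mol_per_L * 1e6
```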

ref: -0 tags: micro LEDS Buzaki silicon neural probes optogenetics date: 04-18-2016 18:00 gmt revision:0 [head]

PMID-26627311 Monolithically Integrated μLEDs on Silicon Neural Probes for High-Resolution Optogenetic Studies in Behaving Animals.

  • 12 uLEDs and 32 rec sites integrated into one probe.
  • InGaN monolithically integrated LEDs.
    • Si has ~ 5x higher thermal conductivity than sapphire, allowing better heat dissipation.
    • Use quantum-well epitaxial layers, 460nm emission, 5nm Ni / 5nm Au current injection w/ 75% transmittance @ design wavelength.
      • Think the n/p GaN epitaxy is done by an outside company, NOVAGAN.
    • Efficiency near 80% -- small LEDs have fewer defects!
    • SiO2 + ALD Al2O3 passivation.
    • 70um wide, 30um thick shanks.

ref: -0 tags: deep reinforcement learning date: 04-12-2016 17:19 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

Prioritized experience replay

  • In general, experience replay can reduce the amount of experience required to learn, and replace it with more computation and more memory – which are often cheaper resources than the RL agent’s interactions with its environment.
  • Transitions (between states) may be more or less
    • surprising (does the system in question have a model of the environment? It does have a model of the expected reward of states & actions, as it's Q-learning.)
    • redundant, or
    • task-relevant
  • Some sundry neuroscience links:
    • Sequences associated with rewards appear to be replayed more frequently (Atherton et al., 2015; Ólafsdóttir et al., 2015; Foster & Wilson, 2006). Experiences with high magnitude TD error also appear to be replayed more often (Singer & Frank, 2009 PMID-20064396 ; McNamara et al., 2014).
  • Pose a useful example where the task is to learn (effectively) a random series of bits -- 'Blind Cliffwalk'. By choosing the replayed experiences properly (via an oracle), you can get an exponential speedup in learning.
  • Prioritized replay introduces bias because it changes [the sampled state-action] distribution in an uncontrolled fashion, and therefore changes the solution that the estimates will converge to (even if the policy and state distribution are fixed). We can correct this bias by using importance-sampling (IS) weights.
    • These weights are the inverse of the priority weights, but don't matter so much at the beginning, when things are more stochastic; they anneal the controlling exponent.
  • There are two ways of selecting (weighting) the priority weights:
    • Direct, proportional to the TD-error encountered when visiting a sequence.
    • Ranked, where errors and sequences are stored in a data structure ordered by error and sampled \propto 1/rank .
  • Somewhat illuminating is how deep TD / Q-learning is unable to even scratch the surface of Tetris or Montezuma's Revenge.
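A bare-bones sketch of the proportional variant with importance-sampling correction (Schaul et al. use a sum-tree for efficient sampling; a plain list suffices to show the idea, and the alpha/beta values here are just typical choices):

```python
import random

class ProportionalReplay:
    # Sample transition i with probability p_i^alpha / sum_j p_j^alpha, where
    # p_i = |TD error| + eps; return IS weights (1/(N*P(i)))^beta to correct
    # the bias introduced by non-uniform sampling.
    def __init__(self, alpha=0.6, beta=0.4, eps=1e-3):
        self.alpha, self.beta, self.eps = alpha, beta, eps
        self.data, self.prio = [], []

    def add(self, transition, td_error):
        self.data.append(transition)
        self.prio.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, k, rng=random):
        total = sum(self.prio)
        probs = [p / total for p in self.prio]
        idx = rng.choices(range(len(self.data)), weights=probs, k=k)
        n = len(self.data)
        w = [(1.0 / (n * probs[i])) ** self.beta for i in idx]
        wmax = max(w)
        w = [wi / wmax for wi in w]   # normalize weights for stability
        return [self.data[i] for i in idx], idx, w

    def update(self, idx, td_errors):
        # re-prioritize after the learner revisits these transitions
        for i, e in zip(idx, td_errors):
            self.prio[i] = (abs(e) + self.eps) ** self.alpha
```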

ref: -0 tags: delete date: 03-26-2016 07:42 gmt revision:1 [0] [head]

select * from sys.tables

ref: -0 tags: meta compilation self-hosting ACM date: 12-30-2015 07:52 gmt revision:2 [1] [0] [head]

META II: Digital Vellum in the Digital Scriptorium: Revisiting Schorre's 1962 compiler-compiler

  • Provides high-level commentary about re-implementing the META-II self-reproducing compiler, using Python as a backend, and mountain climbing as an analogy. Good read.
  • Original paper
  • What it means to be self-reproducing: The original compiler was written in assembly (in this case, a bytecode assembly). When this compiler is run and fed the language description (figure 5 in the paper), it outputs bytecode which is identical (or almost nearly so) to the hand-coded compiler. When this automatically-generated compiler is run and fed the language description (again!) it reproduces itself (same bytecode) perfectly.
    • See section "How the Meta II compiler was written"

ref: Linsmeier-2011.01 tags: histology lund electrodes immune response fine flexible review Thelin date: 12-08-2015 23:57 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

PMID-21867803[0] Can histology solve the riddle of the nonfunctioning electrode? Factors influencing the biocompatibility of brain machine interfaces.

  • We show results from an ultrathin multichannel wire electrode that has been implanted in the rat cerebral cortex for 1 year.
    • 12um Pt-Ir wires in a 200um bundle coated with gelatin. See PMID-20551508[1]
    • Electrode was left in the rat cortex for 354 days
    • no clear GFAP staining or ED1 positive cells at the electrode tips.
  • To improve biocompatibility of implanted electrodes, we would like to suggest that free-floating, very small, flexible, and, in time, wireless electrodes would elicit a diminished cell encapsulation.
  • Suggest standardized methods for the electrode design, the electrode implantation method, and the analyses of cell reactions after implantation
  • somewhat of a review -- Stice, Biran 2005 [2] 2007 [3].
  • 50um is the recording distance Purcell 2009.
  • See also [4]
  • Study of neuronal density and ED1 reactivity / GFAP:
    • Even at 12 weeks the correlation between NeuN density and GFAP / ED1 was small -- r^2 = 0.12
    • Note that DAPI labels many unknown cells in the vicinity of the electrode.


[0] Linsmeier CE, Thelin J, Danielsen N, Can histology solve the riddle of the nonfunctioning electrode? Factors influencing the biocompatibility of brain machine interfaces. Prog Brain Res 194:181-9 (2011)
[1] Lind G, Linsmeier CE, Thelin J, Schouenborg J, Gelatine-embedded electrodes--a novel biocompatible vehicle allowing implantation of highly flexible microelectrodes.J Neural Eng 7:4, 046005 (2010 Aug)
[2] Biran R, Martin DC, Tresco PA, Neuronal cell loss accompanies the brain tissue response to chronically implanted silicon microelectrode arrays.Exp Neurol 195:1, 115-26 (2005 Sep)
[3] Biran R, Martin DC, Tresco PA, The brain tissue response to implanted silicon microelectrode arrays is increased when the device is tethered to the skull.J Biomed Mater Res A 82:1, 169-78 (2007 Jul)
[4] Thelin J, Jörntell H, Psouni E, Garwicz M, Schouenborg J, Danielsen N, Linsmeier CE, Implant size and fixation mode strongly influence tissue reactions in the CNS.PLoS One 6:1, e16267 (2011 Jan 26)

ref: -0 tags: alumina utah array electrode parylene encapsulation date: 10-23-2015 21:28 gmt revision:1 [0] [head]

Utah/Blackrock group has been working on improving the longevity of their parylene encapsulation with the addition of ~50nm Al2O3.

  • PMID-24771981 Self-aligned tip deinsulation of atomic layer deposited Al2O3 and parylene C coated Utah electrode array based neural interfaces
    • Process:
      • Normal Utah array dicing saw / glass frit / thinning and etch fabrication for the Utah probe.
      • Sputtered Ti, Sputtered Pt. (not sure how they mask this?)
      • Sputtered iridium oxide (SIROF, sputtered in an Ar + O2 plasma) electrode tips (again, not sure about the mask..)
      • ALD Al2O3 passivation, 50nm. Cambridge Fiji system, same as nanolab. Must take a long time!
      • A-174, aka 3-Methacryloxypropyltrimethoxysilane adhesion promoter (which presumably acts by pulling hydroxy groups off the alumina substrate; Al-O bonds have higher energy than Si-O)
      • 6um parylene.
      • Laser ablation of tips with 1000 pulses from KrF 5ns 100Hz excimer laser. Works much better than poking the electrode tips through thin aluminum foil.
      • O2 plasma descum / removal of carbon residues.
      • BOE removal of Al2O3 above the SIROF
    • Of note, ALD Al2O3 incorporates hydroxyl groups, which means that it gradually etches in PBS. (Pure Al2O3, such as that which passivates aluminum parts exposed to seawater, does not?)
    • PBS also etches Si3N4 and crystalline Si.
  • IEEE-6627006 (pdf) Bi-layer encapsulation of utah array based neural interfaces by atomic layer deposited Al2O3 and parylene C
    • Atomic layer deposited (ALD) alumina is an excellent moisture barrier with WVTR at the order of ~ 10e-10 g·mm/m2·day [10-13]. But alumina alone is not suitable for encapsulation since it dissolves in water [14].
    • Demonstrated stable power-up of RF encapsulated devices for up to 600 equivalent days in 37C PBS.
      • Actual testing carried out at 57C, 4x accelerated.
  • PMID-24658358 Long-term reliability of Al2O3 and Parylene C bilayer encapsulated Utah electrode array based neural interfaces for chronic implantation.
    • Demonstrated good barrier longevity with wired Utah probes, active probes with flip-chip (Au/Sn eutectic reflow) record/stimulate circuits, and ones with bonded RF stimulation chips, INIR-6. (6th version!)
    • PBS etching of Si lead to undercutting & eventual flake-off of the SIROF, leading to dramatic impedance increase. (Figure 5 and 7).
      • no Pt under the SIROF?
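The "4x accelerated" figure above follows from the usual accelerated-aging rule of thumb: reaction rates roughly double per 10 °C (Q10 = 2) over the 37 °C body-temperature reference.

```python
def aging_acceleration(t_test_c, t_ref_c=37.0, q10=2.0):
    # Q10 rule of thumb: acceleration factor = q10 ** (delta_T / 10 C).
    return q10 ** ((t_test_c - t_ref_c) / 10.0)
```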

ref: -0 tags: reactive oxygen accelerated aging neural implants date: 10-07-2015 18:45 gmt revision:1 [0] [head]

PMID-25627426 Rapid evaluation of the durability of cortical neural implants using accelerated aging with reactive oxygen species.

  • Takmakov P1, Ruda K, Scott Phillips K, Isayeva IS, Krauthamer V, Welle CG.
  • TDT W / PI implants completely failed (W etched and PI completely flaked off) after 1 week in 87C H2O2 / PBS solution. Not surprising.
    • In the Au plated W, the Au remained, the PI flaked off, while thin fragile gold tubes were left. Interesting.
  • Pt/Ir + Parylene-C microprobes seemed to fare better; one was unaffected, others experienced a drop in impedance.
  • NeuroNexus (probably Si3N4-insulated, plus Ir recording pads) showed no physical change in H2O2 RAA, but a strong impedance drop (thicker oxide layer?)
  • Same for blackrock / utah probe (Parylene-C), though there the parylene peeled from the Si substrate a bit.

ref: -0 tags: street fighting mathematics Sanjoy Mahajan date: 10-04-2015 23:09 gmt revision:0 [head]



ref: -0 tags: third harmonic generation Nd:YAG pulsed laser date: 08-29-2015 06:44 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

Problem: have a Q-switched Nd:YAG laser (flashlamp pumped, passively Q-switched) from ebay (see this album). Allegedly it outputs 1J pulses of 8ns duration; in practice, it may emit several ~100mJ pulses ~16ns long while the flashlamp is firing. It was sold as a tattoo removal machine. However, I'm employing it to drill micro-vias in fine polyimide films.

When focused through a 10x objective via the camera mount of a Leica microscope, 532nm (KTP-doubled, second harmonic generation (SHG)) laser pulses ablate the material, but do not leave a clean, sharp hole: it looks more like 'blasting' -- the hole is ragged, more like a crater. This may be from excessive 1064nm heating (partial KTP conversion), or plasma/flame heating & expansion due to absorption of the 532nm / 1064nm light. It may also be due to excessive pulse duration (should the laser not actually be Q-switched... photodiode testing suggests otherwise, but I'd like to verify that), excessive pulse power, insufficient pulse intensity, or insufficient polyimide absorption at 532nm.

The solution to excessive plasma and insufficient polyimide absorption is to shift the wavelength to 355nm (NUV) via third harmonic generation: mixing 1064nm with 532nm gives 355nm, since the frequencies add. This requires sum frequency generation (SFG), for which LBO (lithium triborate) or BBO (beta-barium borate) seem the commonly accepted nonlinear optical materials.

To get SHG or THG, phase and polarization matching of the incoming light is critical. The output of the Nd:YAG laser is, I assume, non-polarized (or randomly polarized), as the KTP crystal simply screws on the front, and so should be rotationally agnostic (and there are no polarizing elements in the simple laser head -- unless the (presumed) Cr:YAG passive Q-switch induces some polarization.)

Output polarization of the KTP crystal will be perpendicular to the incoming beam; if the resulting THG / SFG crystal needs Type-1 phase matching (in phase and with parallel polarizations), a half-wave plate for 1064nm will be needed; for Type-II phase matching, no plate is needed. For noncritical phase matching in LBO (which I just bought), an oven is required to heat the crystal to the correct temperature.

This suggests 73C for THG, while this suggests 150C (for SHG?).

Third harmonic frequency generation by type-I critically phase-matched LiB3O5 crystal by means of optically active quartz crystal Suggests most lasers operate in Type-1 SHG, and Type-II THG, but this is less efficient than dual Type-1; the quartz crystal is employed to rotate the polarizations to alignment. Both SHG and THG crystals are heated for optimum power output.

Finally, Short pulse duration of an extracavity sum-frequency mixing with an LiB3O5 (LBO) crystal suggests that no polarization change is required, nor oven control LBO temperature. Tight focus and high energy density is required, of course (at the expense of reduced crystal lifetime). Likely this is the Type-1,Type-II scheme alluded to in the paper above. I'll try this first before engaging further complexity (efficiency is not very important, as the holes are very small & material removal may be slow.)
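Sanity check on the wavelength arithmetic: in sum-frequency generation the photon frequencies add, so the output wavelength follows 1/λ3 = 1/λ1 + 1/λ2.

```python
def sfg_wavelength_nm(lambda1_nm, lambda2_nm):
    # Sum-frequency generation: 1/lambda3 = 1/lambda1 + 1/lambda2.
    return 1.0 / (1.0 / lambda1_nm + 1.0 / lambda2_nm)
```

Mixing the 1064nm fundamental with its 532nm second harmonic gives 1064/3 ≈ 354.7nm, the quoted "355nm" third harmonic.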

ref: -0 tags: polyimide adhesion delamination Stieglitz date: 08-18-2015 22:19 gmt revision:1 [0] [head]

Thin films and microelectrode arrays for neuroprosthetics

  • Juan Ordonez, Martin Schuettler, Christian Boehler, Tim Boretius and Thomas Stieglitz
  • Discussion of adhesion & the idea of using silicon carbide (as opposed to adhesion promoters such as silane A-174) to maintain good metal-polymer adhesion even at equilibrium water vapor pressure.
  • Transition metals form carbide bonds with polyimide, but noble metals do not.
  • A one-metal (preferably noble) system is advantageous, as two metals will form a galvanic cell and eventually corrode.
  • Therefore it's best to develop non-metallic non-toxic adhesion promotion technologies.

ref: -0 tags: berkeley airbears2 configuration linux debian 8.1 date: 08-13-2015 23:42 gmt revision:1 [0] [head]

ref: -0 tags: polyimide silicon carbide adhesion DBS style electrodes date: 07-22-2015 18:01 gmt revision:0 [head]

PMID-25571176 Fabrication and characterization of a high-resolution neural probe for stereoelectroencephalography and single neuron recording.

  • Layer stack:
    • 5um PI (UBE U-varnish S)
    • 50nm SiC
      • Deposited at 100C.
    • 300nm Pt
    • 30nm SiC
    • 10nm DLC
    • 5um PI
      • Cured at 450C
    • 100nm Al hard mask (removed)
    • Cytop dry adhesion layer
      • softbake to remove solvent,
      • then hardbake at 290C for 4 hours to anneal the PI and adhere the Cytop to it.

ref: -0 tags: polyimide epoxy potassium hydroxide etch adhesion date: 06-25-2015 00:28 gmt revision:0 [head]

Improvement in the adhesion of polyimide/epoxy joints using various curing agents

  • Used 1M KOH, ~2min, followed by 0.2M HCl for 6 min to ring-open the imide.
  • PMDA/ODA polyimide (Pyromellitic Dianhydride, single aromatic ring + 4,4 diamino diphenyl ether )
  • Epoxy of the DGEBA + linear amide or aromatic (3,3 methylenedianiline)
  • Best result was with a polyamide curing agent, and high-temp curing profile. Unlikely that this will work for us, parylene will decompose..

ref: -0 tags: standard enthalpy chemicals list pdf date: 06-25-2015 00:09 gmt revision:1 [0] [head]

Standard thermodynamic properties of chemical substances

hide / / print
ref: -0 tags: tantalum chromium polyimide adhesion date: 06-24-2015 23:20 gmt revision:1 [0] [head]

Tantalum and chromium adhesion to polyimide. Part 2. Peel and locus of failure analyses

  • CF4 etch followed by Ar sputter yielded the strongest bond to the PI.
  • Suggest that failure may be within the PI (cohesive), not between the PI and metal (adhesive).

Tantalum, tantalum nitride, and chromium adhesion to polyimide: effect of annealing ambient on adhesion

  • The peel adhesion at T-0 (initial) shows the following order: TaNx ∼ TaN < Ta ∼ Cr, with all samples failing in apparently virgin PI.
  • After ten thermal cycles to 400°C:
    • in forming gas, the peel adhesion showed the trend TaNx < TaN ∼ Ta ∼ Cr,
    • whereas if the annealing was done in N2, the order changed to TaNx ∼ TaN ≪ Ta < Cr.
  • The peel locus of failure was
    • always in the apparently virgin PI in the Cr/PI samples,
    • while the Ta/PI samples failed in the modified PI,
    • and the TaN/PI and TaNx/PI samples failed between the Ta-nitride and the Cu peel backing film after thermal cycling.

hide / / print
ref: -0 tags: polyimide adhesion chromium copper tie layer upilex date: 06-24-2015 23:14 gmt revision:3 [2] [1] [0] [head]

Adhesion Evaluation of Adhesiveless Metal/Polyimide Substrate for MCM and high density packaging

  • Adhesion degradation after thermal and humidity stresses can occur for a number of reasons.
    • Copper diffusion can promote adhesion loss at elevated temperatures; it can be inhibited by coating a barrier layer of metal -- a tie layer [2].
    • Oxygen diffusion through the polyimide film to the metal/polyimide interface also plays a critical role in promoting degradation [3]. Adhesion of the Cr/polyimide interface is degraded significantly upon exposure to a high-temperature, high-humidity environment due to hydrolysis of the polyimide [4,5].
    • Catastrophic adhesion loss has been linked to moisture-induced oxidation of chromium interfaces, based on studies using radioactively tagged water [4,5].
  • That said, most of these vendors use Cr (20nm) as an adhesion layer, and Cu (200nm) as the conductor.
  • Upilex A fared very well after the pressure cooker test -- >60% retention after 192 hours.
  • Seemingly Ta and Cr both adhere similarly to PI -- {1317}
    • Though Ta is much more ductile, and forms a stronger carbide, Cr is preferred... cheaper?

hide / / print
ref: -2008 tags: tantalum chromium polyimide tungsten flexible neural implants adhesion layer date: 06-24-2015 22:53 gmt revision:2 [1] [0] [head]

PMID-18640155 Characterization of flexible ECoG electrode arrays for chronic recording in awake rats.

  • Yeager JD1, Phillips DJ, Rector DM, Bahr DF.
  • We tested several different adhesion techniques including the following: gold alone without an adhesion layer, titanium-tungsten, tantalum and chromium.
  • All films were DC magnetron sputtered, without breaking vacuum between the adhesion layer (5nm) and the gold conductor layer (300nm).
  • We found titanium-tungsten to be a suitable adhesion layer considering the biocompatibility requirements as well as stability and delamination resistance.
  • While chromium and tantalum produced stronger gold adhesion, concerns over biocompatibility of these materials require further testing.
    • Thought: use tantalum directly, no Ti needed.
    • Much better than Cr -- much more ductile and biocompatible.
    • Caveat: studies show that reduction to stoichiometric Ta results in delamination.
  • Ta resistivity: 1.35e-7 Ω·m; Ti: 4.2e-7 Ω·m -- 3x better, so the film can be 3x thinner.
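A quick sanity check on the resistivity figures above (a sketch: the 5 nm adhesion-layer thickness is an assumed value, and bulk resistivities are used, so real sputtered films will be somewhat worse):

```python
# Back-of-envelope comparison of Ta vs. Ti adhesion layers, using the
# bulk resistivities quoted in the note above.
rho_ta = 1.35e-7  # Ohm*m, tantalum
rho_ti = 4.2e-7   # Ohm*m, titanium

def sheet_resistance(rho, thickness_m):
    """Sheet resistance (Ohms per square) of a uniform film."""
    return rho / thickness_m

# Assume a 5 nm Ti layer; find the Ta thickness with the same sheet resistance.
rs_ti = sheet_resistance(rho_ti, 5e-9)
t_ta_equiv = rho_ta / rs_ti  # Ta thickness giving equal Rs
print(f"5 nm Ti: {rs_ti:.0f} Ohm/sq; equivalent Ta thickness: {t_ta_equiv*1e9:.1f} nm")
```

The ratio 4.2/1.35 ≈ 3.1 is where the "3x thinner" figure comes from.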

hide / / print
ref: -0 tags: adhesion polymer metal FTIR epoxy eponol paint date: 05-01-2015 19:20 gmt revision:0 [head]

Degradation of polymer/substrate interfaces – an attenuated total reflection Fourier transform infrared spectroscopy approach

  • Suggests why eponol is used as an additive to paint.
  • In this thesis, attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy has been used to detect changes at the interfaces between poly (vinyl butyral-co-vinyl alcohol-co-vinyl acetate) (PVB) and ZnSe upon exposure to ozone, humidity and UV-B light.
  • Also, the response of PVB-aluminum interfaces to liquid water has been studied and compared with the same for eponol (epoxy resin, diglycidyl ether of bisphenol A)-aluminum interfaces.
  • In the presence of ozone, humidity and UV-B radiation, an increase in carbonyl group intensity was observed at the PVB-ZnSe interface indicating structural degradation of the polymer near the interface. However, such changes were not observed when PVB coated ZnSe samples were exposed to moisture and UV-B light in the absence of ozone showing that ozone is responsible for the observed structural deterioration. Liquid water uptake kinetics for the degraded PVB monitored using ATR-FTIR indicated a degradation of the physical structural organization of the polymer film.
  • Exposure of PVB coated aluminum thin film to de-ionized water showed water incorporation at the interface. There were evidences for polymer swelling, delamination and corrosion of the aluminum film under the polymer layer.
    • On the contrary, delamination/swelling of the polymer was not observed at the eponol-aluminum interface, although water was still found to be incorporated at the interface. Al-O species were also observed to form beneath the polymer layer.
    • A decrease of the C-H intensities was detected at the PVB-aluminum interface during the water uptake of the polymer, whereas an increase of the C-H intensities was observed for the eponol polymer under these conditions.
    • This is assigned to rearrangement of the macromolecular polymer chains upon interaction with water.

hide / / print
ref: -0 tags: Kewame carbon nanotube yarn wet spinning CNT date: 03-26-2015 18:29 gmt revision:0 [head]

Neural Stimulation and Recording with Bidirectional, Soft Carbon Nanotube Fiber Microelectrodes

  • 43um diameter CNT yarn
  • Shows superior charge injection / surface area.
  • polystyrene-polybutadiene co-polymer insulation (like ABS, without the acrylonitrile)
  • https://chemistry.beloit.edu/classes/nanotech/CNT/nanotoday3_5_24.pdf -- details on the process of spinning these CNT yarns.
    • Tensile strength still far below commercial carbon fibers or high-strength polymers.

hide / / print
ref: -2005 tags: electric fish date: 03-24-2015 20:35 gmt revision:0 [head]

Central Neuroanatomy of Electrosensory Systems in Fish

  • See chapter 3, Morphology of Electroreceptive Sensory Organs, Jørgen Mørup Jørgensen
  • It seems that weakly electric fishes improve external field sensitivity by putting the sensory organs at the end of a sometimes long, conductive-mucus filled duct, surrounded by insulating/fatty/tight-junctioned epithelia, and that these sensory cells tend to have microvilli or other surface-area enhancements.
  • Interestingly, in both saltwater and freshwater, the limit of the most sensitive fishes seems to be the inherent noise of the weakly resistive environment -- saltwater fishes are ~10x more acute... I suspect the microvilli may be to combat noise.
  • That said, the duck-billed platypus' electrosensory organs are bare sensory neurons (still at the end of ducts), with a sensitivity of 20-200uV/cm (the best saltwater fishes are 1000x better), and a great many fish seem to have evolved (and lost) electrosensation.

hide / / print
ref: -2002 tags: electric catfish date: 03-24-2015 20:32 gmt revision:0 [head]

PMID-11889591 Spontaneous nerve activity and sensitivity in catfish ampullary electroreceptor organs after tetanus toxin application

  • M. Struik,F. Bretschneider,R. Peters 2002
  • Applied tetanus toxin (TeTx) to catfish electrosensitive skin & measured spontaneous and evoked afferent responses.
  • The results show that TeTx reduces sensitivity to less than 20% of its original value, whereas the spontaneous activity is unaffected by the treatment. This indicates that the afferent nerve is capable of generating impulses independent of receptor cell neurotransmitter release.
    • Might mean that the amplifying ion channel is Na-permeable?

hide / / print
ref: -0 tags: electric fish date: 03-24-2015 20:14 gmt revision:2 [1] [0] [head]

The physiology of low-frequency electrosensory systems

  • D Bodznick, JC Montgomery - Electroreception
  • In (teleost) ampullary electroreceptors sensory transduction is direct and accomplished by voltage-gated Ca channels in the receptor cell membranes. It is natural to think that extraordinary receptor sensitivity must require extraordinarily sensitive ion channels, but the pharmacology and electrical properties of summed receptor currents indicate that they are very similar to N and L type Ca channels (Sugawara 1989b; Lu and Fishman 1995 {1310}).
    • Instead, the high sensitivity of the electroreceptors appears to derive at least in part from (1) the elimination of response threshold by maintaining the receptor cells in a partially activated state or oscillating even in the absence of a stimulus, (2) DC synapses that also lack a threshold and steadily release transmitter as a function of membrane potential, and (3) a very large convergence ratio of receptor cells to afferent fibers. The structure of the receptor organs also plays a role by ensuring that nearly all the available stimulus potential is brought to bear directly across the sensory epithelium. The long canals in marine forms contribute particularly to sensitivity in shallow voltage gradients.
    • Teleost ampullary electroreceptors apparently evolved from lateral line mechanoreceptors and in each case the receptor cell basal membrane remains the voltage sensor while the apical membrane is passive and low resistance.
    • In Plotosus and probably other teleosts, an electrogenic Na-K pump in the basal surface of the receptor epithelium provides a steady outward bias current across the sensory epithelium at rest (Sugawara 1989a). This current is set at a level to partially activate a noninactivating Ca conductance in the basal receptor cell membrane, and voltage-clamp measures show that the receptor epithelium sits in the negative slope conductance region of its summed I-V curve even at rest.
  • I find some of their explanation of the ionic currents to be loose and unclear; may stem from the fact that research is ongoing.

hide / / print
ref: -0 tags: microflex interconnect polyimide Stieglitz date: 03-03-2015 00:33 gmt revision:1 [0] [head]

IEEE-938305 (pdf) High Density Interconnects and flexible hybrid assemblies for active biomedical implants

  • Idea: make vias in your metallized PI film. Bump-bond through these vias to a chip below.
  • Achieves center-to-center distances of 100um.
  • No longer using this? See {1250}, which uses thermosonic bonding.

hide / / print
ref: -0 tags: polyimide polyamide basic reduction salt surface modification date: 02-27-2015 19:45 gmt revision:0 [head]

Kinetics of Alkaline Hydrolysis of a Polyimide Surface

  • The alkaline hydrolysis of a polyimide (PMDA-ODA) surface was studied as a function of time, temperature and hydroxide ion concentration.
  • Quantification of the number of carboxylic acid groups formed on the modified polyimide surface was accomplished by analysis of data from contact angle titration experiments.
  • Using a large excess of base, pseudo-first-order kinetics were found, yielding kobs ≈ 0.1−0.9 min-1 for conversion of polyimide to poly(amic acid) depending on [OH-].
  • From the dependence of kobs on [OH-], a rate equation is proposed.
  • Conversion of the polyimide surface to one of poly(amic acid) was found to reach a limiting value with a formation constant, K, in the range 2−10 L·mol-1.
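As a sketch of what the pseudo-first-order kinetics above imply for etch timing (k_obs = 0.5 min⁻¹ is an assumed mid-range value from the reported 0.1-0.9 min⁻¹; the limiting conversion the paper describes is not modeled here):

```python
import math

def amic_acid_fraction(t_min, k_obs=0.5):
    """Pseudo-first-order conversion of the polyimide surface to
    poly(amic acid): fraction = 1 - exp(-k_obs * t).
    k_obs in 1/min; the paper reports ~0.1-0.9 depending on [OH-]."""
    return 1.0 - math.exp(-k_obs * t_min)

# Approach to full surface conversion at the assumed mid-range rate:
for t in (1, 2, 5, 10):
    print(f"{t:2d} min: {amic_acid_fraction(t):.2f}")
```

At this rate the surface is mostly converted within a few minutes, consistent with the ~2 min KOH treatments used in the epoxy-adhesion paper above.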

hide / / print
ref: -2000 tags: polyimide acrylic aluminum electro deposition imide insulation ultra thin date: 02-27-2015 19:42 gmt revision:0 [head]

Ultrathin, Layered Polyamide and Polyimide Coatings on Aluminum

  • Alternating polyelectrolyte deposition of layered poly(acrylic acid)/poly(allylamine hydrochloride) (PAA/PAH) films on Al produces ultrathin coatings that protect Al from Cl--induced corrosion.
  • Resistance goes from 5 MΩ·cm² at 10nm thickness to ~50 MΩ·cm² following imidization of the monolayer-applied polymer films.

hide / / print
ref: -0 tags: gold carbon nanotube electroplating impedance PEG date: 10-24-2014 22:25 gmt revision:1 [0] [head]

PMID-21379404 Creating low-impedance tetrodes by electroplating with additives

  • Electroplated tetrodes to 30-70 kΩ by adding polyethylene glycol (PEG) or multi-walled carbon nanotube (MWCNT) solutions to a commercial gold-plating solution.
  • Cui and Martin [12] showed that altering the concentration of gold-plating solution and electroplating current can change the morphology of a gold-plated microelectrode coating.
  • Additionally, Keefer et al. [13] found that adding multi-walled carbon nanotubes (MWCNTs) to a gold-plating solution created microelectrode coatings with a “rice-like” texture and very low impedances.
  • Au electroplating solution made of non-cyanide, gold-plating solution (5355, SIFCO Selective Plating, Cleveland, OH).
  • A one-second, reversed-polarity pulse helped to clean the surface of the tetrode tip and lowered the impedances to 2MΩ to 3 MΩ before electroplating.
  • Electroplating pulses were one to five seconds long and were repeated until the tetrodes reached the desired impedances. After electroplating, the tetrodes were soaked in DI, air dried, and checked for shorts.

Conclusion: 75% PEG, commercial electroplating solution, 0.1 µA current pulses, to 250 kΩ or less.

  • Though the Caswell Au plating solution will likely behave differently ..

hide / / print
ref: -0 tags: delete date: 10-11-2014 00:50 gmt revision:1 [0] [head]

  1. 1000A oxide wafer, as purchased. This makes PI release easy. Native, piranha / HF cleaned wafers are another option; these will then require a heated bath soak & dehydrate prior to parylene deposition. Unclear which is best; biased to the former.
  2. PI spin, bake 350C.
  3. Ar, N2, O2 plasma etch. Need to run some tests to optimize the adhesion protocol, as well as to decide between Al and Ti.
  4. LOR-5A, I-line lift-off. Thought: others use G-line resist for Pd; perhaps it's more thermally resistant?
  5. Al - Pd - Au - Pd metalization. Wait between layers to allow everything to cool down. Make the Al layer thicker to act as a stress buffer? (80nm?) Or try a thin layer of chromium -- 5nm should be fine. {1265} This is a temporary solution, as the Cr-PI interface is hydrolyzed gradually.
  6. If the adhesion is sufficient, lift off in the ultrasonicator, which saves time & reduces PI exposure to NMP. In my experience, Cr makes this significantly easier. Otherwise, redesign the mask to make
  7. Dry overnight + vacuum oven.
  8. Spin second layer of PI, cure 410C.
  9. Etch contact sites -- both recording and bond.
  10. Lift-off to pattern recording sites. Worried that Pd may be oxidized by the O2 plasma -> the top layer should be Au. This will also serve to define the head geometry, and prevent excess etch w/ parylene. Add patterning in the mask to make the liftoff easier; add masking to stop the parylene etch. Do this for the entire outline? Would make the etch significantly easier.
  11. Plasma process for de-scum & adhesion.
  12. Evaporate the second metal stack: Al - Pd - Au, Cr - Au, or Ti - Au (since adhesion will not, at this point, be subject to ultrasonic energy).
  13. O2 plasma etch full device outlines. Should be sharp, since we'll have a full hard mask. That said, this may affect fiber release from parylene, due to slightly different sidewall profiles. Hence this step could be risky -- or could be beneficial, as we can soak indefinitely with only mechanical interlock.
  14. PR mask the regions where we want metal2 to stay.
  15. Au & Ti etch the other areas, leaving hard masks for the parylene etch.
  16. HMDS application to make sure the resist adheres securely.
  17. PR mask (thick!) the areas not to be plated. This is to prevent NiSO4 / NaH2PO4 from releasing the electrodes prematurely, and to keep everything clean from the dirty Ni bath. ** Must be tested, perhaps on the next wafer! **
  18. Ni plate to 10um.
  19. Flash Au plate. This keeps the O2 etch from oxidizing the Ni & prevents parylene from sticking to the Ni.
  20. PR protect vital bonding sites (optional?).
  21. Plasma descum of any residual HMDS.
  22. Parylene dep, 3-5um.
  23. SPR-220-7 thick resist, 10um.
  24. O2 etch to pattern the parylene -- but not over the bondpad sites! Parylene scum messes with both wirebonding and ACF bonding.
  25. Remove the parylene over the bondpad sites before taking the devices off the wafer; clean in acetone.
  26. Remove the electrodes from the wafer using water (bonding area) then IPA (electrode shank); clean off the protection PR in acetone. Removal should be easy from 1000A SiO2 wafers.
  27. Bond to cartridges.
  28. Soak in water 24 hours to lessen PI-parylene adhesion.

hide / / print
ref: -0 tags: kevlar electrodes flexible polymer 12um McNaughton Utah date: 10-11-2014 00:19 gmt revision:0 [head]

PMID-8982987 Metallized polymer fibers as leadwires and intrafascicular microelectrodes

  • McNaughton TG1, Horch KW.
  • Ti/W, Au, Pt metalization via sputtering.
  • 12um core diameter.
  • demonstrate 8 month reliability.
  • 1um dipped silicone elastomer insulation.
  • Note the difficulty in manufacturing the fibers. No kidding!
  • Tensile strength the same as a 25um Pt-Ir wire, 90x more flexible.

hide / / print
ref: -0 tags: nickel chrome polyimide adhesion date: 10-11-2014 00:13 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

Adhesiveless copper on polyimide substrate with nickel-chromium tiecoat

  • Chrome works the best, with Nichrome lagging slightly behind. Thicker tie layers (20nm) work slightly better.
  • 17 nm Cr and 5nm NiCr both work well after gold plating
    • in aggressive cyanide solution -- without tie layer, the copper was released.
    • note how thin the layers are!
  • Surface benefits from oxygen plasma pre-treatment. (de-scum?)
  • Still not sure how to get second layer of polyimide to adhere to top layer of Cr.

Adhesion Between Polymers and Other Substances - A Review of Bonding Mechanisms, Systems and Testing

  • The adhesion between the polyimide, PMDA-ODA and metals such as copper or chromium has received considerable attention due to its importance in the microelectronics industries.
  • As mentioned, the PMDA-ODA is normally deposited from solution as the polyamic acid and cured in-situ to the imide form.
  • Adhesion of the polyimide deposited on a metal is therefore a different problem than adhesion of a metal deposited on the cured polyimide.
  • The former situation (polyimide on metal) tends to give stronger adhesion than the latter (metal on polyimide) but there can be problems of metal, particularly copper, dissolution.
  • Great! (is this a reliable source?)
  • The interaction between the metals and the polyimide has been studied in great detail using x-ray photoelectron spectroscopy (XPS) and other surface analysis techniques but there is not complete agreement on the form of the interaction.
    • It is clear that strong interaction and electron transfer occurs when the metal is deposited from vapour onto the polyimide.
    • When the polyamic acid is deposited on the metal and cured then reaction occurs between the acid and the metal.
  • The strong interface formed between chromium and the polyimide is clearly a result of the strong chemical interaction but there is still considerable interest in making it more resistant to water and oxidation.

High-Performance Polymers (book) Guy Rabilloud (via google books.)

  • Order of metals by increasing adhesion:
    • Cu, Pd, Ni, V, Cr, Nb, Ti [140]
  • The adhesion between chromium and polyimide degrades sharply as the interface is exposed to temperature-humidity stressing (85C, 81% RH) [612].
  • Polyimide-polyimide self-adhesion strongly benefits from partial cure of the first layer (which is not possible with lithographic processes: TMAH etches uncured film). Plasma and adhesion treatments would likely help, due to molecular tangling (?). Presumably VM-651 helps. We'll cross that bridge when we get to it.
  • PMDA-PPD (PMDA-PDA) is perhaps the most rigid of all the polyimides, but due to the extremely hydrophilic nature of PMDA & the associated electron affinity (E_a) of the dianhydride, and the fact that it tends to crystallize & not be tough/plastic, it's infrequently used.

hide / / print
ref: -0 tags: polybenzoxazole PBO synthesis zylon date: 10-10-2014 22:40 gmt revision:0 [head]

Synthesis and thermal properties of polybenzoxazole from soluble precursors with hydroxy-substituted polyenaminonitrile

  • Process:
    1. purified/distilled reagents
    2. made the CCB, an open-ring soluble precursor
    3. eluted the CCB in water / methanol
    4. thermoset the resulting polymer.
  • No control of molecular weight, nor material properties of the cured film.
  • Resultant film was highly temperature resistant, though.

hide / / print
ref: -0 tags: Peter Ledochowitsch ECoG parylene fabrication MEMS date: 09-25-2014 16:54 gmt revision:0 [head]

IEEE-5734604 (pdf) Fabrication and testing of a large area, high density, parylene MEMS µECoG array

  • Details 5-layer platinum parylene process for high density ECoG arrays.

hide / / print
ref: -0 tags: wirebonding finishes gold nickel palladium electroless electrolytic date: 09-21-2014 02:53 gmt revision:3 [2] [1] [0] [head]

Why palladium?

To prevent black nickel: http://tayloredge.com/reference/Electronics/PWB/BlackPad_ITRI_Round1.PD

From the paper's introduction: The use of electroless nickel / immersion gold (E.Ni/I.Au) as a circuit board finish has grown significantly in the last few years. It provides a flat board finish, is very solderable, provides a precious metal contact surface and the nickel strengthens the plated holes. However, as the usage of E.Ni/I.Au increased, a problem was found on BGA (Ball Grid Array) components. An open or fractured solder joint sometimes appears after board assembly on the occasional BGA pad. The solder had wet and dissolved the gold and formed a weak intermetallic bond to the nickel. This weak bond to the nickel readily fractures under stress or shock, leaving an open circuit. The incidence of this problem appears to be very sporadic and a low ppm level problem, but it is very unpredictable. A BGA solder joint cannot be touched-up without the component being removed. After the BGA component is removed, a black pad is observed at the affected pad site. This black pad is not readily solderable, but it can be repaired.

From: http://www.smtnet.com/Forums/index.cfm?fuseaction=view_thread&Thread_ID=4430

You don't have enough gold. Your 2uin is too porous and is allowing the nickel to corrode. Prove this by hand-soldering to these pads with a more active flux, like a water-soluble solder paste, than you are using.

You must have at least 3uin of immersion gold. Seriously consider >5uin.

Your nickel thickness is fine. Although if you wanted to trade costs, consider giving-up nickel to 150uin thickness, while increasing the gold thickness. Gold over electroless nickel creates brittle joints because of phosphorous in the nickel plating bath. The phosphorous migrates into the over-plating. Electrolytic nickel and gold plating should not be a problem.

If you stay with the electroless nickel, keep the phosphorous at a mid [7 - 9%] level. Just as important, don't let the immersion gold get too aggressive. The immersion gold works by corroding the nickel. If it is too aggressive it takes away the nickel and leave phosphorous behind. This makes it look like the phosphorous level is too high in the nickel bath.

Gold purity is very important for any type of wire bonding process. For aluminum wedge bonding, gold should have a purity of 99.99% [no thallium], and the nickel becomes critical: no contaminants, and the nickel wants to be plated as soft as possible. This requires good control of pH and plating chemicals in the nickel-plating bath.

Harman "Wire Bonding In Microelectronics" McGraw-Hill is a good resource for troubleshooting wire bonding. I reviewed it in the SMTnet Newsletter a couple of months ago.

That said, electrolytic nickel + electrolytic gold does work well -- perhaps even better than ENEPIG:
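For reference, the microinch thicknesses quoted in the forum post convert to nm as follows (1 µin = 25.4 nm exactly):

```python
# Convert the gold/nickel finish thicknesses quoted above from
# microinches to nanometers.
UIN_TO_NM = 25.4  # 1 microinch = 25.4 nm

for uin in (2, 3, 5, 30, 150):
    print(f"{uin:3d} uin = {uin * UIN_TO_NM:6.0f} nm")
```

So the "at least 3uin" of immersion gold is only ~76 nm; the 2uin that failed is ~51 nm.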

hide / / print
ref: -0 tags: RF microstimulation cats threshold date: 09-04-2014 18:43 gmt revision:1 [0] [head]

PMID-13539663 Subcortical threshold voltages as a function of sine wave frequencies Brown and Brackett

  • 22 GA insulated stainless steel electrodes, both bipolar and monopolar.
    • This happens to be near spike recording passband, unfortunately.
  • Square wave stimulation (8) Mihailovic and Delgado 1956 "Electrical stimulation of monkey brain with various frequencies and pulse durations".
  • Hines (6)(1940) , stimulating the monkey cortex with [a] sine wave, reported jerky uncompleted movements from 1260 Hz to 1440 Hz.
    • Monopolar surface stimulation, though.

hide / / print
ref: Cosman-2005.12 tags: microstimulation RF pain neural tissue ICMS date: 09-04-2014 18:10 gmt revision:14 [13] [12] [11] [10] [9] [8] [head]

One of the goals/needs of the lab is to be able to stimulate and record nervous tissue at the same time. We do not have immediate access to optogenetic methods, but what about lower-frequency EM stimulation? The idea: if you put the stimulation frequency outside the recording system bandwidth, there is no need to switch, and indeed no reason you can't stimulate and record at the same time.

Hence, I very briefly checked for the effects of RF stimulation on nervous tissue.

  • PMID-16336478[0] Electric and Thermal Field Effects in Tissue Around Radiofrequency Electrodes
    • Most clinical response to pulsed RF is heat ablation - the RF pulses can generate 'hot spots' c.f. continuous RF.
    • A secondary effect may be electroporation; this has not been extensively investigated.
    • Suggests that 500kHz pulses can be 'rectified' by the membrane, and hence induce sodium influx, hence neuron activation.
    • They propose that some of the clinical effects of pulsed RF stimulation are mediated through an LTD response.
  • {1297} -- original!
  • PMID-14206843[2] Electrical Stimulation of Excitable Tissue by Radio-Frequency Transmission
    • Actually not so interesting -- deals with RF-powered pacemakers and bladder stimulators, both of which include rectification.
  • Pulsed and Continous Radiofrequency Current Adjacent to the Cervical Dorsal Root Ganglion of the Rat Induces Late Cellular Activity in the Dorsal Horn
    • shows that neurons are activated by pulsed RF, albeit through c-Fos staining. Electrodes were much larger in this study.
    • Also see PMID-15618777[3] associated editorial which calls for more extensive clinical, controlled testing. The editorial gives some very interesting personal details - scientists from the former Soviet bloc!
  • PMID-16310722[4] Pulsed radiofrequency applied to dorsal root ganglia causes a selective increase in ATF3 in small neurons.
    • used 20ms pulses of 500kHz.
    • Small diameter fibers are differentially activated.
    • Pulsed RF induces activating transcription factor 3 (ATF3), which has been used as an indicator of cellular stress in a variety of tissues.
    • However, there were no particular signs of axonal damage; hence the clinically effective analgesia may be reflective of a decrease in cell activity, synaptic release (or general cell health?)
    • Implies that RF may be dangerous below levels that cause tissue heating.
  • Cellphone Radiation Increases Brain Activity
    • Implies that RF energy -- here presumably at 800-900 MHz or 1800-1900 MHz -- is capable of exciting nervous tissue without electroporation.
  • Random idea: I wonder if it is possible to get a more active signal out of an electrode by stimulating with RF? (simultaneously?)
  • Human auditory perception of pulsed radiofrequency energy
    • Evidence seems to support the theory that it is slight local heating -- 6e-5 C -- that creates pressure waves which can be heard by humans, guinea pigs, etc.
    • Unlikely to be direct neural stimulation.
    • High frequency hearing is required for this
      • Perhaps because it is the lower harmonics of head resonance that are heard (??).

Conclusion: worth a shot, especially given the paper by Alberts et al 1972.

  • There should be a frequency that sodium channels react to, without inducing cellular stress.
  • Must be very careful to not heat the tissue - need a power controlled RF stimulator
    • The studies above seem to work with voltage-control (?!)
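To gauge how much headroom the out-of-band idea has, here is a sketch of how strongly a 500 kHz stimulation carrier would be rejected by a recording chain; the 10 kHz cutoff and 4th-order Butterworth anti-alias filter are assumptions for illustration, not parameters from any of the papers above:

```python
import math

def butter_lowpass_mag(f_hz, f_cutoff_hz, order):
    """Magnitude response of an ideal Butterworth low-pass filter:
    |H(f)| = 1 / sqrt(1 + (f/fc)^(2n))."""
    return 1.0 / math.sqrt(1.0 + (f_hz / f_cutoff_hz) ** (2 * order))

# Assumed spike-band recording chain: 10 kHz cutoff, 4th order.
# Stimulation carrier at 500 kHz, as in the pulsed-RF papers above.
atten = butter_lowpass_mag(500e3, 10e3, 4)
print(f"500 kHz carrier attenuated to {atten:.1e} ({20*math.log10(atten):.0f} dB)")
```

Even a modest filter buys well over 100 dB of rejection, so the carrier itself should not saturate the recordings; rectified (demodulated) components falling in-band are the real concern.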


[0] Cosman ER Jr, Cosman ER Sr, Electric and thermal field effects in tissue around radiofrequency electrodes.Pain Med 6:6, 405-24 (2005 Nov-Dec)
[1] Alberts WW, Wright EW Jr, Feinstein B, Gleason CA, Sensory responses elicited by subcortical high frequency electrical stimulation in man.J Neurosurg 36:1, 80-2 (1972 Jan)
[3] Richebé P, Rathmell JP, Brennan TJ, Immediate early genes after pulsed radiofrequency treatment: neurobiology in need of clinical trials.Anesthesiology 102:1, 1-3 (2005 Jan)
[4] Hamann W, Abou-Sherif S, Thompson S, Hall S, Pulsed radiofrequency applied to dorsal root ganglia causes a selective increase in ATF3 in small neurons.Eur J Pain 10:2, 171-6 (2006 Feb)

hide / / print
ref: -0 tags: physical principles of scalable neural recording marblestone date: 08-25-2014 20:21 gmt revision:0 [head]

PMID-24187539 Physical principles for scalable neural recording.

  • Marblestone AH1, Zamft BM, Maguire YG, Shapiro MG, Cybulski TR, Glaser JI, Amodei D, Stranges PB, Kalhor R, Dalrymple DA, Seo D, Alon E, Maharbiz MM, Carmena JM, Rabaey JM, Boyden ES, Church GM, Kording KP.

hide / / print
ref: -0 tags: hike tamalpais kent lake california date: 08-24-2014 19:17 gmt revision:1 [0] [head]

Pretty solid hike yesterday. 25.25 miles (or likely more, given the limited resolution of my tracing) in about 7.5 hours, for an average speed of 3.4 mph. Lots of different terrain and ecosystems along the way -- redwoods to lakeside to golden grassy hilltops to manzanita / scrub forest. Would be good for mountain biking.

hide / / print
ref: -0 tags: intracortical utah array fabrication MEMS Normann date: 08-14-2014 01:35 gmt revision:5 [4] [3] [2] [1] [0] [head]

PMID-1937509 A silicon-based, three-dimensional neural interface: manufacturing processes for an intracortical electrode array.

  • Campbell PK1, Jones KE, Huber RJ, Horch KW, Normann RA. (1991)
  • One of (but not the) first papers describing their methods / idea (I think).
  • First conf paper: {1294} (1988)
  • later adopted glass frit insulator --

hide / / print
ref: -0 tags: utah array development date: 08-14-2014 01:34 gmt revision:1 [0] [head]

IEEE-94953 (pdf) Silicon based microstructures suitable for intracortical electrical stimulation (visual prosthesis application)

  • Normann, R.A. ; Dept. of Bioeng., Utah Univ., Salt Lake City, UT, USA ; Campbell, P.K. ; Li, W.P.
  • 1988
  • not quite yet there...

hide / / print
ref: -0 tags: utah array development failure mode donoghue date: 08-14-2014 01:30 gmt revision:0 [head]

PMID-24216311 Failure mode analysis of silicon-based intracortical microelectrode arrays in non-human primates

  • Barrese JC, Rao N, Paroo K, Triebwasser C, Vargas-Irwin C, Franquemont L, Donoghue JP. (2013)
  • Most failures (56%) occurred within a year of implantation, with acute mechanical failures the most common class (48%), largely because of connector issues (83%).
  • Among grossly observable biological failures (24%), a progressive meningeal reaction that separated the array from the parenchyma was most prevalent (14.5%).

hide / / print
ref: Nordhausen-1994.02 tags: Utah array electrodes optimization date: 08-14-2014 01:24 gmt revision:2 [1] [0] [head]

PMID-8180807[0] Optimizing recording capabilities of the Utah Intracortical Electrode Array.

  • Nordhausen, Rousch, Normann (1993)
  • Originally it was designed for stimulation in a visual prosthesis.
  • Thought that the large surface area would securely anchor it to the cortex
    • Turns out you need to put gore-tex on top to keep it from being expelled.
  • Varied the exposed electrode tip to determine the optimum area.
  • Oldschool computer plots ...


[0] Nordhausen CT, Rousche PJ, Normann RA, Optimizing recording capabilities of the Utah Intracortical Electrode Array.Brain Res 637:1-2, 27-36 (1994 Feb 21)

ref: -0 tags: tungsten welding CVD arc braze 1971 date: 08-12-2014 20:56 gmt revision:0 [head]

Weldability of Tungsten and Its Alloys

  • tried relatively exotic brazing methods:
    • Niobium,
    • Tantalum
    • W - 26% Re
    • Mo
      • No mention of what we'll be doing (NiCr resistance wire -- the only easily available fine wire)
  • Note that the ductile-to-brittle transition is low for their samples, 150-250C.
  • Samples made via arc-melting or WF + H2 CVD.

ref: Seymour-2011.06 tags: PEDOT Seymour electrode recording parylene date: 08-06-2014 22:39 gmt revision:3 [2] [1] [0] [head]

PMID-21301965[0] Novel multi-sided, microelectrode arrays for implantable neural applications.

  • There are problems with parylene multielectrode arrays:
    • water and salts will rapidly diffuse into the various interfacial boundaries
    • Interfacial delamination due to poor wet adhesion of parylene on metal
      • This possibly due to mechanical stress
      • causes excessive cross-talk or noise.
    • Parylene-C devices are prone to poor adhesion at either the dielectric to dielectric interface or at the dielectric to metal interface *** (Sharma and Yasuda 1982; Yasuda et al 2001)
  • solution: PPX-CH2NH2 and PPX-CHO -- reactive parylene (amine bonds?!)
  • PEDOT is absolutely essential for attaining reasonable performance / impedance from the 85um^2 gold electrodes.
    • Thermal noise on 280um^2 and 170um^2 Au electrodes was too high to record neurons.
    • Au thickness 0.5um.
  • Performed soak tests on their electrodes; the reactive parylene is good, but not sure if it's a worthy improvement.
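The thermal-noise point above can be sanity-checked with the Johnson noise formula v_rms = sqrt(4*kB*T*R*B); the 1 Mohm electrode impedance and 10 kHz recording bandwidth below are illustrative assumptions, not numbers from the paper:

```python
import math

# Johnson (thermal) noise of an electrode modeled as a resistor.
k_B = 1.380649e-23  # Boltzmann constant, J/K
T = 310.0           # body temperature, K
R = 1e6             # electrode resistance, ohms (assumed)
bw = 1e4            # recording bandwidth, Hz (assumed)

v_rms = math.sqrt(4 * k_B * T * R * bw)
print(f"{v_rms * 1e6:.1f} uV rms")  # ~13 uV rms, comparable to extracellular spike amplitudes
```

At ~13 uV rms the noise floor swamps small spikes, which is consistent with the larger bare-Au sites being unusable and PEDOT (which drops the impedance) being essential.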


[0] Seymour JP, Langhals NB, Anderson DJ, Kipke DR, Novel multi-sided, microelectrode arrays for implantable neural applications.Biomed Microdevices 13:3, 441-51 (2011 Jun)

ref: -0 tags: neurite growth factor NGF 1977 date: 08-05-2014 23:02 gmt revision:0 [head]

PMID-270699 Local control of neurite development by nerve growth factor.

  • After neurites crossed the barrier (fluid barrier to NGF), local removal of nerve growth factor from the distal portions of the neurites caused the growth of these portions to stop, and they eventually appeared to degenerate even though nerve growth factor was continuously present in the chamber that contained their somas and proximal portions.
  • In contrast, local nerve growth factor was not required at the somas and proximal portions of the neurites; many neurons survived its withdrawal provided their somas were associated with neurite bundles that crossed into a chamber containing nerve growth factor.

ref: -0 tags: debugging reinvented java CMU code profiling instrumentation date: 08-02-2014 06:32 gmt revision:3 [2] [1] [0] [head]

images/1289_1.pdf -- Debugging reinvented: Asking and Answering Why and Why not Questions about Program Behavior.

  • Smart approach to allow users to quickly find the causes of bugs (or more generically, any program actions).

ref: -0 tags: automatic programming inductive functional igor date: 07-29-2014 02:07 gmt revision:0 [head]

Inductive Rule Learning on the Knowledge Level.

  • 2011.
  • v2 of their IGOR inductive-synthesis program.
  • Quote: The general idea of learning domain specific problem solving strategies is that first some small sample problems are solved by means of some planning or problem solving algorithm and that then a set of generalized rules are learned from this sample experience. This set of rules represents the competence to solve arbitrary problems in this domain.
  • My take is that, rather than using heuristic search to discover programs by testing specifications, they use memories of the output to select programs directly (?)
    • This is allegedly a compromise between the generate-and-test and analytic strategies.
  • Description is couched in CS-lingo which I am inexperienced in, and is perhaps too high-level, a sin I too am at times guilty of.
  • It seems like a good idea, though the examples are rather unimpressive as compared to MagicHaskeller.

ref: -0 tags: polyimide anodic release 2005 date: 06-16-2014 23:58 gmt revision:1 [0] [head]

IEEE-1416914 (pdf) Partial release and detachment of microfabricated metal and polymer structures by anodic metal dissolution

  • recommend 100nm Cr/Al release layer.
  • finished devices just 'float to the surface' of saline solution.

ref: -0 tags: polyimide silicon oxide aluminum adhesion pressure cooker date: 06-16-2014 21:28 gmt revision:2 [1] [0] [head]

Interfacial adhesion of polymeric coatings for microelectronic encapsulation

  • Find that, after a pressure-cooker test, adhesion of polyimide PI-2610 (what we use) to SiO2 was weaker than to Al, SiN, and copper.
  • Aluminum adhesion is quite good, at least to (only) 15 days @ 85C / 85% RH. Reference studies that find the adhesion to be 'acceptable' for the microelectronics industry.
    • Should we use an aluminum adhesion layer? Less biocompatible metal than Ti, and more likely to degrade in saline.
  • Found that copper adhesion actually went up with water exposure!
  • Polyimide adheres more strongly to glass than epoxy following accelerated aging.

ref: -0 tags: maleimide azobenzine glutamate photoswitch optogenetics date: 06-16-2014 21:19 gmt revision:0 [head]

PMID-16408092 Allosteric control of an ionotropic glutamate receptor with an optical switch

  • 2006
  • Use an azobenzene (two benzene rings joined by an N=N double bond) as a photo-switchable allosteric activator that reversibly presents glutamate to an ion channel.
  • PMID-17521567 Remote control of neuronal activity with a light-gated glutamate receptor.
    • The molecule, in use.
  • Likely the molecule is harder to produce than channelrhodopsin or halorhodopsin, hence less used. Still, it's quite a technology.

ref: -0 tags: ovipositor wasp fig needle insertion SEM date: 05-29-2014 19:58 gmt revision:0 [head]

Biomechanics of substrate boring by fig wasps

  • Lakshminath Kundanati and Namrata Gundiah 2014

ref: -0 tags: noise triboelectric implant BMI date: 05-16-2014 17:28 gmt revision:1 [0] [head]

source -- Durand

ref: -0 tags: optogenetics glutamate azobenzine date: 05-07-2014 19:51 gmt revision:0 [head]

PMID-17521567 Remote control of neuronal activity with a light-gated glutamate receptor.

  • Neuron 2007.
  • azobenzenes undergo a cis-to-trans conformational change via illumination with one wavelength, and trans-to-cis via another. (neat!!)
  • This was used to create photo-controlled (on and off) glutamate channels.

ref: -0 tags: polyimide adhesion aluminum integrated circuit date: 05-07-2014 19:29 gmt revision:0 [head]

Polyimide insulators for multilevel interconnections Arthur M. Wilson

  • Old article (1981), but has useful historical information on the development of various PI insulators and their adhesion to aluminum, SiOx, etc.
  • Suggests that a higher-temperature cure (400C) is needed to fully drive water from the PI & cause a glass-transition. Might want to do this for the second PI layer.

ref: -0 tags: microelectrode patents date: 05-02-2014 00:07 gmt revision:1 [0] [head]

Various microelectrode patents:


ref: -0 tags: kevlar polyamide orientation thin-films date: 04-07-2014 19:08 gmt revision:1 [0] [head]

Preparation of uniaxially oriented polyamide films by vacuum deposition polymerization

  • Jiro Sakata, Midori Mochizuki
  • Grew polyamide (PPTA, kevlar) films using VDP (vacuum deposition polymerization).
    • Two precursors were heated in a vacuum to yield a stoichiometric polymer.
  • Polymer chains were oriented with rubbing with different polymers, e.g. cotton (!)

ref: -0 tags: parylene plasma ALD insulation long-term saline PBS testing date: 04-02-2014 21:32 gmt revision:0 [head]

PMID-23024377 Plasma-assisted atomic layer deposition of Al(2)O(3) and parylene C bi-layer encapsulation for chronic implantable electronics.

  • This report presents an encapsulation scheme that combines Al(2)O(3) by atomic layer deposition with parylene C.
  • Al2O3 layer deposited using PAALD process-500 cycles of TMA + O2 gas.
  • Alumina and parylene coating lasted at least 3 times longer than parylene coated samples tested at 80 °C
    • That's it?
  • The consistency of leakage current suggests that no obvious corrosion was occurring to the Al2O3 film. The extremely low leakage current (≤20 pA) was excellent for IDEs after roughly three years of equivalent soaking time at 37 °C.
    • Still, they warn that it may not work as well for in-vivo devices, which are subject to tethering forces and micromotion.

ref: -0 tags: carbon fiber electrode array parylene fire sharpening microthread date: 03-20-2014 19:57 gmt revision:5 [4] [3] [2] [1] [0] [head]

PMID-23860226 A carbon-fiber electrode array for long-term neural recording.

  • Guitchounts G1, Markowitz JE, Liberti WA, Gardner TJ.
  • We describe an assembly method for a 16-channel electrode array consisting of carbon fibers (<5 µm diameter) individually insulated with Parylene-C and fire-sharpened. The diameter of the array is approximately 26 µm along the full extent of the implant.
  • Fibers from http://www.goodfellowusa.com/
    • Young's modulus of 380 GPa vs. tungsten's 400 GPa.
    • Data available from Toho Tenax
  • The absence of any report of single neuron isolation in HVC with a fixed chronic electrode implant underscores the difficulty of recording small cells (8-15um soma) with an implant whose damage length scale is large relative to the target neurons. (!!)

ref: -0 tags: polyimide adhesion oxygen nitrogen plasma surface energy date: 03-10-2014 22:33 gmt revision:0 [head]

Adhesion Properties of Electroless-Plated Cu Layers on Polyimide Treated by Inductively Coupled Plasmas

  • O2 then N2/H2 ICP treatment of polyimide surfaces dramatically raises the surface energy (i.e. lowers the water contact angle), and increases the adhesion of palladium-catalyzed electroless copper.
  • Particularly, C-N bonds are increased as revealed by XPS.
  • No peel-strength measurements given.

ref: -0 tags: flexible neural probe polyimide silicon polyethylene glycol dissolvable jove livermore loren frank date: 03-05-2014 19:18 gmt revision:0 [head]


  • details the flip-chip bonding method (clever!)
  • as well as the silicon stiffener fabrication process.

ref: -0 tags: palladium metal glass tough strong caltech date: 02-25-2014 19:02 gmt revision:1 [0] [head]

A damage-tolerant glass

  • Perhaps useful for the inserter needle?
  • WC-Co (tungsten carbide-cobalt) cermet is another alternative.

ref: -0 tags: spectroscopy frequency domain PMT avalanche diode laser Tufts date: 02-25-2014 19:02 gmt revision:0 [head]

Frequency-domain techniques for tissue spectroscopy and imaging

  • 52 pages, book chapter
  • Good detail on bandwidth, tissue absorption, various technologies.

ref: -0 tags: hinton convolutional deep networks image recognition 2012 date: 01-11-2014 20:14 gmt revision:0 [head]

ImageNet Classification with Deep Convolutional Networks

ref: -0 tags: perl directory descent script remove date: 01-10-2014 06:12 gmt revision:0 [head]

Simple Perl script for removing duplicate files within sub-directories of a known depth:

#!/usr/bin/perl -w
# Walk one level of directories, then delete the duplicate
# *_1.jpg / *_2.jpg copies inside each second-level directory.

@files = <*>;
foreach $file (@files) {
	@files2 = <$file/*>;
	foreach $file2 (@files2) {
		print $file2 . "\n";
		`rm -rf $file2/*_1.jpg`;
		`rm -rf $file2/*_2.jpg`;
	}
}

ref: Cheung-2007.03 tags: flexible electrode array Michigan probe histology Vancouver current source density EPFL polyimide date: 12-21-2013 21:07 gmt revision:5 [4] [3] [2] [1] [0] [head]

PMID-17027251[0] Flexible polyimide microelectrode array for in vivo recordings and current source density analysis.

  • Polyimide -- PI-2611 precursor.
  • 50nm Ti adhesion, 200nm Pt, both sputtered.
  • Electrodes etched via RIE in Cl2.
    • Sputtered and photo-patterned SiO2 etch mask.
  • Used regular solder to connect to a Samtec.
  • 15um total thickness.
  • 25um electrode diameter.
  • They were inserted directly (no carrier nor guide) into the brain; can be re-used.
  • Tested to 8 weeks.
  • No figure comparing silicon and polyimide, though they claim minimal GFAP response to the electrodes.


[0] Cheung KC, Renaud P, Tanila H, Djupsund K, Flexible polyimide microelectrode array for in vivo recordings and current source density analysis.Biosens Bioelectron 22:8, 1783-90 (2007 Mar 15)

ref: -0 tags: stretchable nanoparticle conductors gold polyurethane flocculation date: 12-13-2013 02:12 gmt revision:5 [4] [3] [2] [1] [0] [head]

PMID-23863931 Stretchable nanoparticle conductors with self-organized conductive pathways.

  • 13nm gold nanoparticles, citrate-stabilized colloidal solution
    • Details of fabrication procedure in methods & supp. materials.
  • Films are prepared in water and dried (like paint)
  • LBL = layer by layer. layer of polyurethane + layer of gold nanoparticles.
    • Order of magnitude higher conductivity than the VAF films.
  • VAF = vacuum assisted flocculation.
    • Mix Au-citrate nanoparticles + polyurethane and pass through filter paper.
    • Peel the flocculant from the filter paper & dry.
  • Conductivity of the LBL films ~ 1e4 S/cm -> 1e-6 Ohm*m (pure gold = 2e-8 Ohm*m, i.e. 50x better)
  • VAF = 1e3 S/cm -> 1e-5 Ohm*m. Still pretty good.
    • This equates to a resistance of 1k / mm in a 10um^2 cross-sectional area wire (2um x 5 um, e.g.)
  • The material can sustain > 100% strain when thermo-laminated.
    • Laminated: 120C at 20 MPa for 1 hour.
  • See also: Preparation of highly conductive gold patterns on polyimide via shaking-assisted layer-by-layer deposition of gold nanoparticles
    • Patterned via MCP -- microcontact printing (aka rubber-stamping)
    • Bulk conductivity of annealed (150C) films near that of pure gold (?)
    • No mechanical properties, though; unclear if these films are more flexible / ductile than evaporated film.
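The "1k / mm" figure above follows directly from R = rho * L / A; a quick sketch, using the VAF resistivity and the 10 um^2 (e.g. 2um x 5um) example cross-section from the bullet:

```python
# Wire resistance per millimeter: R = rho * L / A.
rho = 1e-5    # VAF film resistivity, ohm*m (from 1e3 S/cm)
L = 1e-3      # wire length: 1 mm, in meters
A = 10e-12    # cross-section: 10 um^2, in m^2

R = rho * L / A
print(R)  # 1000.0 ohms per mm of wire
```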

ref: -0 tags: shape memory polymers neural interface thiolene date: 12-06-2013 22:55 gmt revision:0 [head]

PMID-23852172 A comparison of polymer substrates for photolithographic processing of flexible bioelectronics

  • Describe the deployment of shape-memory polymers for a neural interface
    • Thiol-ene/acrylate network (see figures)
    • Noble metals react strongly to the thiols, yielding good adhesion.
  • Cr/Au thin films.
  • Devices change modulus as they absorb water; clever!
  • Transfer by polymerization patterning of electrodes (rather than direct sputtering).
    • This + thiol adhesion still might not be sufficient to prevent micro-cracks.
  • "Neural interfaces fabricated on thiol-ene/acrylate substrates demonstrate long-term fidelity through both in vitro impedance spectroscopy and the recording of driven local field potentials for 8 weeks in the auditory cortex of laboratory rats. "
  • Impedance decreases from 1M @ 1kHz to ~ 100k over the course of 8 weeks. Is this acceptable? Seems like the insulator is degrading (increased capacitance; they do not show phase of impedance)
  • PBS uptake @ 37C:
    • PI seems to have substantial PBS uptake -- 2%
    • PDMS the lowest -- 0.22%
    • PEN (polyethylene naphthalate) -- 0.36%
    • Thiol-ene/acrylate 2.19%
  • Big problem is that during photolithographic processing all the shape-memory polymers go through Tg, and become soft/rubbery, making thin metal film adhesion difficult.
    • Wonder if you could pattern more flexible materials, e.g. carbon nanotubes (?)
  • Good paper, many useful references!
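A rough way to read that impedance drop: if the interface were purely capacitive (a simplifying assumption on my part, not the paper's model), |Z| = 1 / (2*pi*f*C), so the 1M -> ~100k drop at 1 kHz implies roughly a 10x increase in effective capacitance:

```python
import math

# Effective capacitance from impedance magnitude: C = 1 / (2*pi*f*|Z|).
f = 1e3  # measurement frequency, Hz

C_start = 1 / (2 * math.pi * f * 1e6)  # |Z| = 1 Mohm at implant
C_end = 1 / (2 * math.pi * f * 1e5)    # |Z| ~ 100 kohm at 8 weeks

print(f"{C_start * 1e12:.0f} pF -> {C_end * 1e12:.0f} pF")  # ~159 pF -> ~1592 pF
```

A tenfold capacitance increase with no phase data shown is why the impedance trend reads like insulator degradation (water uptake / delamination) rather than a stable interface.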

ref: -0 tags: DBS parkinsons dystonia review neurosurgery date: 10-05-2013 22:33 gmt revision:0 [head]

PMID-17848864 Deep brain stimulation

  • Kern DS, Kumar R. 2007
  • extensive review!

ref: -0 tags: polyimide platinum electrodes Spain longitudinal intrafasicular adhesion delamination date: 10-05-2013 22:24 gmt revision:4 [3] [2] [1] [0] [head]

PMID-17278585 Assessment of biocompatibility of chronically implanted polyimide and platinum intrafascicular electrodes. 2007

  • Designed platinum/polyimide longitudinal intrafasicular electrodes (LIFEs)
    • 25um Pt/Ir, insulated to 60-75um diameter. Pt/Ir has a Young's modulus of 202 GPa.
      • Plated with platinum black under sonication, as this forms a tougher surface than without sonication.
      • See also: PMID-20485478 Improving impedance of implantable microwire multi-electrode arrays by ultrasonic electroplating of durable platinum black. Desai SA, Rolston JD, Guo L, Potter SM. 2010
    • Polyimide PI2611, 10um thick, 50mm long, 220um wide in the electrode segment.
  • Implanted into rat sciatic nerve for 3 months.
  • These electrodes have been tested in people for two days:
    • Electrical stimulation through the implanted electrodes elicited graded sensations of touch, joint movement, and position, referring to the missing limb. This suggested that peripheral nerve interfaces could be used to provide amputees with prosthetic limbs with sensory feedback and volitional control that is more natural than what is possible with current myoelectric and body-powered prostheses.
  • CMAPs = compound muscle action potentials.
  • CNAPs = compound nerve action potentials.
  • Platinum wire LIFE performed very similarly to the thin-film polyimide LIFE in most all tests, with slightly higher potentials recorded by the larger polyimide probe.
  • Higher encapsulation with the polyimide probes! Geometry?
  • However, the polyimide LIFEs induced less functional decline than the wire LIFEs.
  • Other polyimide studies [14] [16] [24] -- one of which they observed a 70% reduction of tensile strength after 11 months of implantation.
    • [14] F. J. Rodríguez, D. Ceballos, M. Schüttler, E. Valderrama, T. Stieglitz, and X. Navarro, “Polyimide cuff electrodes for peripheral nerve stimulation,” J. Neurosci. Meth., vol. 98, pp. 105–118, 2000.
    • [16] N. Lago, D. Ceballos, F. J. Rodríguez, T. Stieglitz, and X. Navarro, “Long term assessment of axonal regeneration through polyimide regenerative electrodes to interface the peripheral nerve,” Biomaterials, vol. 26, pp. 2021–2031, 2005.
    • [24] M. Schuettler, K. P. Koch, and T. Stieglitz, “Investigations on explanted micromachined nerve electrodes,” in Proc. 8th Annu. Int. Conf. Int. Functional Electrical Stimulation Soc., Maroochydore, Australia, 2003, pp. 306–310.
      • The technology of sandwiching a metallization layer between two layers of polyimide seems to be suitable, because no delamination of the polyimide layers was observed even after 11 months. The right choice of metals for building the electrical conductive elements of the microelectrodes is crucial. Ti/Au/Ti/Pt layers tend to flake off from polyimide while delamination of Ti/Pt layers was not observed. However, adhesion of Ti/Pt layers was investigated after 2.5 months of implantation while Ti/Au/Ti/Pt layers were exposed after 11 months to the biological system. In previous research projects, surgeons also reported on delamination of Ti/Au layers from polyimide substrate after three months. Unfortunately, we had no possibility of inspecting these microelectrodes in our laboratory.
      • See also {1250}

ref: -0 tags: polyimide flexible cable frontiers florida date: 10-04-2013 01:55 gmt revision:0 [head]

PMID-24062716 A highly compliant serpenti