Why multifactor?
 Take a simple MLP. Let $x$ be a layer activation: $x^0$ is the input, $x^1$ is the second layer (the first hidden layer). These are vectors, indexed like $x^a_i$.
 Then $x^1 = \phi(W x^0)$, or element-wise $x^1_j = \phi(\sum_{i=1}^N w_{ij} x^0_i)$, where $\phi$ is the nonlinear activation function (ReLU, sigmoid, etc.).
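A minimal sketch of that forward pass in numpy (the sizes $N$, $M$ and the random weights are hypothetical; ReLU stands in for $\phi$):

```python
import numpy as np

def relu(z):
    """ReLU nonlinearity, one common choice of phi."""
    return np.maximum(0.0, z)

# Hypothetical sizes: N input units, M hidden units.
N, M = 4, 3
rng = np.random.default_rng(0)
W = rng.normal(size=(M, N))   # W[j, i] = w_{ij}: weight from input i into hidden unit j
x0 = rng.normal(size=N)       # input layer x^0

# x^1_j = phi(sum_i w_{ij} x^0_i), i.e. x^1 = phi(W x^0)
x1 = relu(W @ x0)
print(x1.shape)               # (3,)
```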
 In standard STDP the learning rule follows $\Delta w \propto f(x_{\mathrm{pre}}(t), x_{\mathrm{post}}(t))$, or, if the layer number is $a$, $\Delta w^{a+1} \propto f(x^a(t), x^{a+1}(t))$.
 (Of course nobody thinks there are 'numbers' on the 'layers' of the brain; this is just shorthand for pre- and post-synaptic.)
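To make the two-factor form concrete, here is a toy rate-based Hebbian update of the shape $\Delta w \propto f(x_{\mathrm{pre}}, x_{\mathrm{post}})$. Real STDP depends on spike *timing*, which this rate sketch deliberately ignores; the learning rate `eta` and the choice of $f$ as an outer product are illustrative assumptions:

```python
import numpy as np

def local_update(x_pre, x_post, eta=0.01):
    """Toy two-factor rule: Delta w_{ij} = eta * x_pre_i * x_post_j.

    The update uses only locally available pre- and post-synaptic
    activity -- no error signal from higher layers.
    """
    return eta * np.outer(x_pre, x_post)

x_pre = np.array([1.0, 0.0, 0.5])   # presynaptic rates
x_post = np.array([0.2, 0.8])       # postsynaptic rates
dW = local_update(x_pre, x_post)
print(dW.shape)                     # (3, 2)
```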
 In an artificial neural network trained by backprop, $\Delta w_{ij}^a \propto \frac{\partial E}{\partial w_{ij}^a} \propto \delta_{j}^a x_{i}^{a-1}$ (intuitively: the weight change is proportional to the error propagated down from higher layers times the input activity), where $\delta_{j}^a = \left(\sum_{k=1}^{N} w_{jk}^{a+1} \delta_k^{a+1}\right) \phi'$ and $\phi'$ is the derivative of the nonlinear activation function, evaluated at the unit's pre-activation.
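The delta recursion can be sketched directly (sizes, random weights, and the upstream $\delta^{a+1}$ are all hypothetical placeholders; ReLU again stands in for $\phi$):

```python
import numpy as np

def relu_deriv(z):
    """phi'(z) for ReLU, evaluated at the pre-activation z."""
    return (z > 0).astype(float)

# Hypothetical slice of a network: layer a has M units, layer a+1 has K units.
M, K = 3, 2
rng = np.random.default_rng(1)
W_next = rng.normal(size=(K, M))   # W_next[k, j] = w_{jk}: layer a -> layer a+1
z_a = rng.normal(size=M)           # pre-activations of layer a
delta_next = rng.normal(size=K)    # delta^{a+1}, assumed already computed

# delta^a_j = (sum_k w_{jk} delta^{a+1}_k) * phi'(z^a_j)
delta_a = (W_next.T @ delta_next) * relu_deriv(z_a)

# Delta w^a_{ij} proportional to delta^a_j * x^{a-1}_i
x_prev = rng.normal(size=4)        # activations of layer a-1
dW = np.outer(x_prev, delta_a)
print(dW.shape)                    # (4, 3)
```

The key contrast with the STDP form above: `delta_a` depends on weights and errors from *higher* layers, so the update is no longer a function of local pre/post activity alone.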
 $f(i, j) \rightarrow [x, y, \theta, \phi]$
 $k = 13.165$
 $x = \operatorname{round}(i / k)$
 $y = \operatorname{round}(j / k)$
 $\theta = a \left(\frac{i}{k} - x\right) + b \left(\frac{i}{k} - x\right)^2$
 $\phi = a \left(\frac{j}{k} - y\right) + b \left(\frac{j}{k} - y\right)^2$
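A sketch of this mapping as code, assuming the residual reading of $\theta$ and $\phi$ as quadratics in $\frac{i}{k} - x$ and $\frac{j}{k} - y$; the text leaves $a$ and $b$ unspecified, so the values below are placeholders:

```python
K = 13.165           # the constant k from the text
A, B = 1.0, 0.5      # hypothetical values for a and b (unspecified above)

def grid_map(i, j, a=A, b=B, k=K):
    """Map flat indices (i, j) to [x, y, theta, phi].

    x, y pick the nearest coarse grid cell; theta, phi are quadratic
    functions of the residual offset within that cell.
    """
    x = round(i / k)
    y = round(j / k)
    u = i / k - x        # residual offset along i
    v = j / k - y        # residual offset along j
    theta = a * u + b * u ** 2
    phi = a * v + b * v ** 2
    return [x, y, theta, phi]

print(grid_map(20, 7))
```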
