m8ta

{1410}  
Structure discovery in Nonparametric Regression through Compositional Kernel Search
 
{806}  
I've recently tried to determine the bitrate conveyed by one Gaussian random process about another in terms of the signal-to-noise ratio between the two. Assume $x$ is the known signal to be predicted, and $y$ is the prediction. Let's define $SNR(y) = \frac{Var(x)}{Var(err)}$ where $err = x - y$. Note this is a ratio of powers; for the conventional SNR, $SNR_{dB} = 10 \log_{10} \frac{Var(x)}{Var(err)}$. $Var(err)$ is also known as the mean-squared error (MSE), provided the error has zero mean.

Now, $Var(err) = \frac{1}{N} \sum (x - y - \bar{err})^2 = Var(x) + Var(y) - 2 Cov(x,y)$; assume $x$ and $y$ have unit variance (or scale them so that they do), so that $Var(err) = SNR(y)^{-1} = 2 - 2 Cov(x,y)$, then

$\frac{2 - SNR(y)^{-1}}{2} = Cov(x,y)$

We need the covariance because the mutual information between two jointly Gaussian zero-mean variables can be defined in terms of their covariance matrix (see http://www.springerlink.com/content/v026617150753x6q/ ). Here $Q$ is the covariance matrix,

$Q = \left[ \array{Var(x) & Cov(x,y) \\ Cov(x,y) & Var(y)} \right]$

$MI = \frac{1}{2} \log \frac{Var(x) Var(y)}{\det(Q)}$

With unit variances,

$\det(Q) = 1 - Cov(x,y)^2$

Then, in bits,

$MI = -\frac{1}{2} \log_2 \left[ 1 - Cov(x,y)^2 \right]$

or

$MI = -\frac{1}{2} \log_2 \left[ SNR(y)^{-1} - \frac{1}{4} SNR(y)^{-2} \right]$

This agrees with intuition. If we have an SNR of 10 dB, or 10 (power ratio), then we would expect to be able to break a random variable into about 10 different categories or bins (recall stdev is the sqrt of the variance), with the probability of the variable being in the estimated bin to be 1/2. (This, at least in my mind, is where the 1/2 constant comes from: if there is Gaussian noise, you won't be able to determine exactly which bin the random variable is in, hence $\log_2$ is an overestimator.) Here is a table with the respective values, including the amplitude (not power) ratio representations of SNR.
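The chain from SNR to mutual information above can be sketched in a few lines of Python. This is only an illustration under the same assumptions as the derivation (unit-variance, zero-mean, jointly Gaussian $x$ and $y$); the function names and the list of SNR values are mine, not from any library. Printing it for a handful of SNRs gives a table of power ratio, amplitude ratio, and bits per sample.

```python
import math


def snr_db_to_power(snr_db):
    """Convert an SNR in decibels to a power ratio Var(x) / Var(err)."""
    return 10.0 ** (snr_db / 10.0)


def mutual_information_bits(snr_power):
    """Mutual information in bits per sample, assuming unit-variance x and y.

    Uses Cov(x,y) = (2 - 1/SNR) / 2 and MI = -1/2 * log2(1 - Cov^2).
    With unit variances Var(err) lies in (0, 4), so SNR must exceed 0.25.
    """
    if snr_power <= 0.25:
        raise ValueError("SNR must be > 0.25 for unit-variance x and y")
    cov = (2.0 - 1.0 / snr_power) / 2.0
    return -0.5 * math.log2(1.0 - cov ** 2)


# Illustrative SNR values in dB (an arbitrary choice, not from the post).
print("SNR dB   power   amplitude   MI (bits)")
for snr_db in (0.0, 3.0, 10.0, 20.0, 30.0):
    power = snr_db_to_power(snr_db)
    amplitude = math.sqrt(power)  # amplitude ratio is sqrt of power ratio
    mi = mutual_information_bits(power)
    print(f"{snr_db:6.1f} {power:8.2f} {amplitude:10.2f} {mi:10.3f}")
```

Note that a power ratio of 0.5 gives $Cov(x,y) = 0$ and hence zero bits: at that point the prediction carries no information about the signal.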
Now, to get the bitrate, you take the SNR, calculate the mutual information, and multiply it by the bandwidth (not the sampling rate in a discrete-time system) of the signals. In our particular application, I think the bandwidth is between 1 and 2 Hz, hence we're getting 1.6-3.2 bits/second/axis, hence 3.2-6.4 bits/second for our normal 2D tasks. If you read this blog regularly, you'll notice that others have achieved 4 bits/sec with one neuron and 6.5 bits/sec with dozens {271}.
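As a quick check of the arithmetic, here is a minimal sketch of the bitrate step. The function name and arguments are hypothetical; the 1.6 bits per sample figure is the value implied by the numbers quoted above (1.6 bits/second/axis at 1 Hz), and the 1-2 Hz bandwidth is my estimate from the text.

```python
def bitrate_bits_per_second(mi_bits, bandwidth_hz, n_axes=1):
    """Bitrate = mutual information (bits) x bandwidth (Hz) x number of axes.

    Assumes each axis carries the same, independent amount of information.
    """
    return mi_bits * bandwidth_hz * n_axes


mi_bits = 1.6  # bits per sample, implied by the figures quoted in the text

for bandwidth_hz in (1.0, 2.0):
    per_axis = bitrate_bits_per_second(mi_bits, bandwidth_hz)
    two_d = bitrate_bits_per_second(mi_bits, bandwidth_hz, n_axes=2)
    print(f"{bandwidth_hz:.0f} Hz: {per_axis:.1f} bits/s/axis, {two_d:.1f} bits/s in 2D")
```

Treating the two axes as independent is the assumption that lets the per-axis rate simply double for 2D tasks.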
{762} 
ref: work-0
tags: covariance matrix adaptation learning evolution continuous function normal gaussian statistics
date: 06-30-2009 15:07 gmt
revision:0
[head]


http://www.lri.fr/~hansen/cmatutorial.pdf
