m8ta

{1543} 
ref: 2019
tags: backprop neural networks deep learning coordinate descent alternating minimization
date: 07212021 03:07 gmt
revision:1


Beyond Backprop: Online Alternating Minimization with Auxiliary Variables
This is interesting in that the weight updates can be done in parallel (perhaps more efficiently), but you are still propagating errors backward, albeit via optimizing 'codes'. Given the vast infrastructure devoted to autodiff + backprop, I can't see this being adopted broadly. That said, the idea of alternating minimization (which is used e.g. for EM clustering) is powerful, and this paper describes (though I didn't read that part closely) guarantees on the convergence of the alternating minimization. Likewise, the authors show how to improve the performance of the online / minibatch algorithm by keeping around memory variables, in the form of covariance matrices. 
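A minimal sketch of the alternating-minimization idea, not the paper's exact algorithm: train a toy two-layer ReLU regression network by introducing auxiliary 'codes' Z that stand in for the hidden activations, then alternate between updating Z, the output weights, and the input weights with the other variables held fixed. All names, penalty weights, and step sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data from a random two-layer ReLU "teacher" network.
X = rng.normal(size=(200, 10))
Y = np.maximum(X @ rng.normal(size=(10, 5)), 0) @ rng.normal(size=(5, 1))

# Student network, with auxiliary "codes" Z replacing the hidden activations.
W1 = 0.1 * rng.normal(size=(10, 5))
W2 = 0.1 * rng.normal(size=(5, 1))
Z = np.maximum(X @ W1, 0)  # initialize codes with a forward pass
mu = 1.0                   # penalty coupling Z to the layer-1 output

def loss():
    return (np.sum((Z @ W2 - Y) ** 2)
            + mu * np.sum((Z - np.maximum(X @ W1, 0)) ** 2)) / len(X)

losses = [loss()]
for _ in range(50):
    A = np.maximum(X @ W1, 0)
    # (1) Fix W1, W2: the objective is quadratic in Z, so solve exactly:
    #     Z (W2 W2^T + mu I) = Y W2^T + mu A
    M = W2 @ W2.T + mu * np.eye(5)
    Z = np.linalg.solve(M, (Y @ W2.T + mu * A).T).T
    # (2) Fix Z: layer 2 is a (ridge) least-squares problem, solved exactly.
    W2 = np.linalg.solve(Z.T @ Z + 1e-6 * np.eye(5), Z.T @ Y)
    # (3) Fix Z: one gradient step on W1 so ReLU(X W1) tracks the codes.
    A = np.maximum(X @ W1, 0)
    W1 -= 0.05 * (2.0 / len(X)) * X.T @ ((A - Z) * (A > 0))
    losses.append(loss())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Note that errors still flow backward here, just through the code variables: the Z update in step (1) pulls the codes toward the targets, and step (3) then pulls the earlier layer toward the codes. Steps (1)-(3) decouple per layer given the codes, which is what permits the parallel weight updates.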