Predictive Coding Approximates Backprop along Arbitrary Computation Graphs (Paper Explained)
Yannic Kilcher
#ai #biology #neuroscience
Backpropagation is the workhorse of modern deep learning and a core component of most frameworks, but it has long been known not to be biologically plausible, which has driven a divide between neuroscience and machine learning. This paper shows that Predictive Coding, a much more biologically plausible algorithm, can approximate Backpropagation for any computation graph. The authors verify this experimentally by building and training CNNs and LSTMs using Predictive Coding. This suggests that the brain and deep neural networks could be much more similar than previously believed.
OUTLINE:
0:00 - Intro & Overview
3:00 - Backpropagation & Biology
7:40 - Experimental Results
8:40 - Predictive Coding
29:00 - Pseudocode
32:10 - Predictive Coding approximates Backprop
35:00 - Hebbian Updates
36:35 - Code Walkthrough
46:30 - Conclusion & Comments

Paper: https://arxiv.org/abs/2006.04182
Code: https://github.com/BerenMillidge/PredictiveCodingBackprop
Abstract: Backpropagation of error (backprop) is a powerful algorithm for training machine learning architectures through end-to-end differentiation. However, backprop is often criticised for lacking biological plausibility. Recently, it has been shown that backprop in multilayer-perceptrons (MLPs) can be approximated using predictive coding, a biologically-plausible process theory of cortical computation which relies only on local and Hebbian updates. The power of backprop, however, lies not in its instantiation in MLPs, but rather in the concept of automatic differentiation which allows for the optimisation of any differentiable program expressed as a computation graph. Here, we demonstrate that predictive coding converges asymptotically (and in practice rapidly) to exact backprop gradients on arbitrary computation graphs using only local learning rules. We apply this result to develop a straightforward strategy to translate core machine learning architectures into their predictive coding equivalents. We construct predictive coding CNNs, RNNs, and the more complex LSTMs, which include a non-layer-like branching internal graph structure and multiplicative interactions. Our models perform equivalently to backprop on challenging machine learning benchmarks, while utilising only local and (mostly) Hebbian plasticity. Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry, and may also contribute to the development of completely distributed neuromorphic architectures.
Authors: Beren Millidge, Alexander Tschantz, Christopher L. Buckley
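The abstract's central claim, that predictive coding converges to backprop gradients using only local updates, can be illustrated with a small numerical sketch. This is not the authors' code. It assumes a plain tanh MLP, a squared-error loss, and a "fixed prediction" simplification in which the forward-pass predictions and their derivatives are held constant while the value nodes relax. The function names (`forward`, `backprop_grads`, `predictive_coding_grads`) and the parameters `n_steps` and `step_size` are hypothetical and chosen only for illustration.

```python
import numpy as np


def forward(weights, x):
    """Forward pass of a tanh MLP; returns activations of every node."""
    acts = [x]
    for W in weights:
        acts.append(np.tanh(W @ acts[-1]))
    return acts


def backprop_grads(weights, x, target):
    """Reference gradients of L = 0.5 * ||target - output||^2 via backprop."""
    acts = forward(weights, x)
    delta = acts[-1] - target  # dL/d(output)
    grads = [None] * len(weights)
    for i in reversed(range(len(weights))):
        pre = delta * (1.0 - acts[i + 1] ** 2)  # dL/d(pre-activation)
        grads[i] = np.outer(pre, acts[i])
        delta = weights[i].T @ pre  # dL/d(acts[i])
    return grads


def predictive_coding_grads(weights, x, target, n_steps=500, step_size=0.1):
    """Gradients from relaxing value nodes under the fixed-prediction assumption."""
    preds = forward(weights, x)  # predictions, held fixed during relaxation
    n = len(weights)
    values = [p.copy() for p in preds]  # value nodes start at their predictions
    values[-1] = target.copy()  # output node is clamped to the target

    for _ in range(n_steps):
        # Local prediction errors: value minus prediction at each node.
        errors = [v - p for v, p in zip(values, preds)]
        for i in range(1, n):  # update hidden value nodes only
            # Error of the child node, passed back through the local Jacobian.
            top_down = weights[i].T @ (errors[i + 1] * (1.0 - preds[i + 1] ** 2))
            values[i] = values[i] + step_size * (-errors[i] + top_down)

    errors = [v - p for v, p in zip(values, preds)]
    # Weight gradient uses only the local error and the presynaptic activity.
    return [
        -np.outer(errors[i + 1] * (1.0 - preds[i + 1] ** 2), preds[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = [rng.normal(scale=0.5, size=s) for s in [(5, 4), (6, 5), (3, 6)]]
    x, target = rng.normal(size=4), rng.normal(size=3)
    bp = backprop_grads(weights, x, target)
    pc = predictive_coding_grads(weights, x, target)
    for layer, (g_bp, g_pc) in enumerate(zip(bp, pc)):
        print(layer, np.max(np.abs(g_bp - g_pc)))
```

Each hidden node in the sketch only sees its own error and the error of its direct child, yet at equilibrium the errors equal the negative backprop gradients with respect to the activations. The printed values are the largest per-layer differences between the two sets of weight gradients, which should shrink as `n_steps` grows.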
Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kil ...
https://www.youtube.com/watch?v=LB4B5FYvtdI