Kolmogorov-Arnold Representation Theorem
The most surprising aspect of neural networks is their simplicity. I don’t mean that a neural network as a whole is simple. It is not. But at any given instant you are either adding numbers or applying a function to one number (a single-variable function). Why these two kinds of operations can give you any old function of multiple variables is a mystery to me! Let me take a small step towards understanding this by reading this really old paper by Kolmogorov.
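To make the puzzle concrete, here is a minimal sketch of how one genuinely two-variable function, plain multiplication, can be built from nothing but additions and single-variable functions. It relies on the identity xy = ((x + y)^2 - (x - y)^2) / 4. This is my own illustration, not a construction taken from Kolmogorov's paper, and the helper names (`square`, `negate`, `quarter`, `product`) are made up for the example.

```python
# Single-variable functions: each one takes exactly one number.
def square(t):
    return t * t

def negate(t):
    return -t

def quarter(t):
    return t / 4.0

def product(x, y):
    """Compute x * y using only addition and single-variable functions.

    Uses the identity x*y = ((x + y)^2 - (x - y)^2) / 4.
    """
    s = x + y                 # addition
    d = x + negate(y)         # addition of x and a one-variable function of y
    return quarter(square(s) + negate(square(d)))

print(product(3.0, 4.0))
```

The two inputs only ever meet inside a `+`; everything else acts on one number at a time. That works for multiplication because of a lucky algebraic identity, so it does not by itself explain why the same trick should be possible for an arbitrary function of several variables.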