Neurons, layers, and backprop — wired by hand
A weighted sum, a nonlinearity, a prediction.
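That single neuron fits in a few lines of NumPy; a minimal sketch with a sigmoid as the nonlinearity (the input, weight, and bias values are arbitrary, chosen only for illustration):

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: a weighted sum, then a sigmoid nonlinearity."""
    z = np.dot(w, x) + b                # weighted sum of inputs plus bias
    return 1.0 / (1.0 + np.exp(-z))     # squash into (0, 1): the prediction

x = np.array([0.5, -1.0, 2.0])          # inputs
w = np.array([0.1, 0.4, -0.2])          # weights
b = 0.3                                 # bias
y = neuron(x, w, b)                     # a scalar prediction in (0, 1)
```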
The chain rule, made mechanical.
Chain rule across arbitrarily deep networks.
Derive backward for a 2-layer MLP by hand — checked live against finite differences.
A full multi-layer perceptron in pure NumPy.
Xavier, He, and the math of vanishing and exploding gradients.