Neural Networks: Zero to Hero
A series of lectures from Andrej Karpathy building neural networks from the ground up. Via Michael Nielsen.
A hands-on explanation with a nice combination of theory and practice. Focused on language models, with the promise that it'll build up to modern transformer models.
I really enjoyed reading the transformer code here: https://github.com/karpathy/makemore/blob/master/makemore.py The key code is roughly 70 lines, most of which is straightforward boilerplate. The core is maybe 20 lines; that's what implements a GPT-2-style model!
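To get a feel for why the core is so small, here is a minimal NumPy sketch of causal self-attention, the central operation in that transformer code. This is not Karpathy's code; it is a simplified single-head version (no batching, no learned projections beyond the three weight matrices) written for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, Wq, Wk, Wv):
    # x: (T, C) sequence of T token embeddings of size C
    T, C = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    att = (q @ k.T) / np.sqrt(C)                      # (T, T) attention scores
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)  # upper triangle = future
    att[mask] = -np.inf                               # causal mask: no peeking ahead
    return softmax(att) @ v                           # weighted sum of value vectors

# Hypothetical toy example with random weights
rng = np.random.default_rng(0)
T, C = 5, 8
x = rng.normal(size=(T, C))
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (5, 8): one output vector per input position
```

The real model stacks this (multi-headed, with layer norm and an MLP) a handful of times, which is most of the remaining lines.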