Series · 🔒 Private

Transformer Architectures

From the 2017 paper to today's LLMs. Self-attention and the QKV trio, the GPT decoder-only branch, RLHF and alignment, and the modern transformer anatomy you'd actually fine-tune.

5 lessons · ~21 min read