- Learning CUDA by optimizing matrix-vector multiplication (SGEMV) for cuBLAS-like performance - A worklog
- Contemplative LLMs: Anxiety is all you need?
- Learning CUDA by optimizing softmax: A worklog
- Tensors from Scratch #2: Elementwise operations and Broadcasting
- Tensors from Scratch #1: Tensors and their Shapes
- Be multidimensional, anon
- Informal approach to attention mechanisms