LoRA: Low-Rank Adaptation

Fine-tuning billion-parameter models with tiny matrices

LoRA is one of the most practical applications of matrix decomposition in modern AI. Instead of updating a giant weight matrix W (d × k parameters), you freeze W and train two small matrices B (d × r) and A (r × k), where r ≪ min(d, k), so the update is ΔW = BA. This module connects SVD theory to practice: you'll understand why weight updates tend to be low-rank, implement a LoRA layer from scratch, and see how this enables fine-tuning LLMs on consumer hardware.

Mini-lab: Implement a LoRA adapter layer. Compare parameter counts for full fine-tuning vs. LoRA at ranks 4, 8, and 16, and show that a rank-8 update captures 95%+ of the full update.
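The mini-lab can be sketched in a few lines of NumPy. This is a minimal forward-only sketch, not the full lab solution: the class and helper names are illustrative, and the init scheme (A small Gaussian, B zeros, so ΔW starts at exactly zero) follows the standard LoRA recipe.

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update ΔW = B @ A.

    Illustrative sketch: forward pass only, no gradients or optimizer.
    """
    def __init__(self, W, r=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        d, k = W.shape
        self.W = W                                   # frozen, shape (d, k)
        self.A = rng.normal(0, 0.02, size=(r, k))    # trainable, small random init
        self.B = np.zeros((d, r))                    # trainable, zero init -> ΔW = 0 at start
        self.scale = alpha / r                       # standard LoRA scaling factor

    def forward(self, x):
        # y = x W^T + scale * x A^T B^T  (never materializes the d x k matrix ΔW)
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

def lora_params(d, k, r):
    """Trainable parameters in B (d x r) plus A (r x k)."""
    return r * (d + k)

# Parameter-count comparison for a square 4096 x 4096 layer (illustrative size)
d, k = 4096, 4096
full = d * k
for r in (4, 8, 16):
    p = lora_params(d, k, r)
    print(f"rank {r:2d}: {p:>9,} params  ({full / p:.0f}x fewer than full fine-tuning)")
```

For the 4096 × 4096 example, rank 8 trains 65,536 parameters against 16,777,216 for full fine-tuning, a 256× reduction; note the savings scale as min(d, k) / (2r) for square matrices.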

Estimated time: 60 minutes

