Linear Algebra for AI
Master the language of AI: vectors, matrices, transformations, and eigenvalues -- with Python code to ground every concept. From NumPy basics to LoRA fine-tuning and mechanistic interpretability.
About This Course
Linear algebra is the mathematical backbone of modern AI and machine learning. This course teaches you to think in vectors and matrices, with every concept grounded in executable Python code.
You won't just memorize formulas -- you'll build intuition for what linear transformations actually do, visualize high-dimensional spaces, and understand why eigenvalues matter for everything from Google's PageRank to neural networks.
The course bridges theory and production: you'll see how SVD powers LoRA fine-tuning, how embeddings live in vector spaces, how attention is pure linear algebra, and how quantization trades precision for speed. Every module includes a Python mini-lab so you can manipulate the math and see the result change.
The AI tutor can verify its own math through code execution, making this subject especially well-suited for reliable, interactive learning.
Inspired by Gilbert Strang's MIT 18.06, 3Blue1Brown's Essence of Linear Algebra, and modern AI research (LoRA, mechanistic interpretability, quantization).
Prerequisites
- Basic algebra (solving equations, working with variables)
- Familiarity with Python basics is helpful but not required
- No prior linear algebra experience needed
What You Will Learn
- Understand vectors, matrices, and their geometric interpretations
- Visualize transformations with matplotlib and build geometric intuition
- Solve systems of linear equations using Gaussian elimination
- Grasp linear independence, span, and basis -- the core of vector spaces
- Compute and interpret determinants, inverses, eigenvalues, and eigenvectors
- Apply PCA for dimensionality reduction on real datasets
- Use SVD for image compression and low-rank approximation
- Understand how word embeddings and LLM embeddings work geometrically
- Know how ANN algorithms (HNSW, IVF) power vector search at scale
- Read neural network architectures as chains of matrix operations
- Understand the attention mechanism (QKV) as pure linear algebra
- Explain how LoRA compresses fine-tuning via low-rank factorization
- Grasp how mechanistic interpretability uses linear directions to decode model behavior
- Understand quantization as an affine transformation trading precision for speed
Your Learning Path
Each module builds on the last. Take your time -- the AI tutor is with you at every step.
Vectors: The Language of Data — From arrows to arrays -- how AI represents everything as vectors
Vectors are the atoms of linear algebra and the native data format of AI. This module builds your intuition from geometric arrows to NumPy arrays. You'll learn vector addition, scalar multiplication, and the dot product -- then discover that dot products measure similarity, the idea behind cosine similarity in recommendation systems and LLM embeddings. Visualize everything with matplotlib so the geometry clicks before the formulas. Mini-lab: Compute cosine similarity between sentence embeddings to find the most similar document in a small collection.
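A minimal sketch of the mini-lab's core computation, using tiny made-up 3-D vectors in place of real sentence embeddings (which would have hundreds of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    # Dot product normalized by vector lengths: 1 = same direction, 0 = orthogonal.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy "embeddings" -- illustrative values only, not from a real encoder.
docs = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.7, 0.3, 0.0]),
    "car": np.array([0.1, 0.0, 0.95]),
}
query = np.array([0.85, 0.15, 0.05])

# Retrieve the document whose embedding points in the most similar direction.
best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
```

Swap in real embeddings from any sentence encoder and the retrieval logic stays identical.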
Matrices as Transformations — Every matrix is a function that reshapes space
Matrices aren't just grids of numbers -- they are linear transformations. This module teaches matrix addition, multiplication, and transposition, but always through the lens of geometry: rotation, scaling, shearing, and projection. You'll visualize each operation with matplotlib, watching the unit square deform in real time. This geometric framing is essential because every layer of a neural network is a matrix transformation. Mini-lab: Build an interactive 2D transformation visualizer -- pick a matrix, watch the grid warp.
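The core of the visualizer, without the plotting: apply a matrix to all four corners of the unit square at once by stacking them as columns. (A shear matrix is used here as one example; any 2x2 matrix works.)

```python
import numpy as np

# Unit square corners as columns -- one matrix multiply transforms all of them.
square = np.array([[0.0, 1.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])

shear = np.array([[1.0, 0.5],
                  [0.0, 1.0]])  # horizontal shear

transformed = shear @ square
# The corner (1, 1) lands at (1.5, 1): the top edge slides right.
```

Pass `transformed` to `matplotlib.pyplot.fill` to watch the square deform.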
Systems of Linear Equations — Gaussian elimination and the art of solving Ax = b
Most of applied math reduces to solving Ax = b. This module teaches you Gaussian elimination and row reduction -- the algorithmic backbone of linear algebra. You'll see how systems of equations correspond to intersecting hyperplanes, understand when solutions exist (and when they don't), and implement row reduction in Python. Mini-lab: Solve a system of 5 equations by hand via row reduction, then verify with np.linalg.solve.
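A compact sketch of the row-reduction algorithm the module teaches (with partial pivoting for numerical stability), checked against NumPy's solver on a small example system:

```python
import numpy as np

def gaussian_elimination(A, b):
    # Forward elimination with partial pivoting, then back-substitution.
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for col in range(n):
        pivot = int(np.argmax(np.abs(A[col:, col]))) + col  # largest pivot
        A[[col, pivot]] = A[[pivot, col]]                   # swap rows
        b[[col, pivot]] = b[[pivot, col]]
        for row in range(col + 1, n):
            factor = A[row, col] / A[col, col]
            A[row, col:] -= factor * A[col, col:]           # eliminate below pivot
            b[row] -= factor * b[col]
    x = np.zeros(n)
    for row in range(n - 1, -1, -1):                        # back-substitution
        x[row] = (b[row] - A[row, row + 1:] @ x[row + 1:]) / A[row, row]
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = gaussian_elimination(A, b)
```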
Vector Spaces and Subspaces — Linear independence, span, basis, and dimension
This module introduces the abstract structure that unifies all of linear algebra: vector spaces. You'll learn what it means for vectors to be linearly independent, what 'span' really means, and why choosing the right basis simplifies everything. These concepts are the vocabulary you need to understand PCA, SVD, and embedding spaces. Mini-lab: Find a basis for the column space of a matrix and verify that non-basis vectors can be written as linear combinations.
Determinants and Inverses — The scaling factor of transformations and when you can undo them
The determinant tells you how a matrix transformation scales area (or volume). If it's zero, the transformation crushes space into a lower dimension -- and the matrix has no inverse. This module builds geometric intuition for determinants, then connects to matrix invertibility. You'll compute determinants and inverses by hand and with NumPy. Mini-lab: Visualize how the determinant changes as you continuously deform a 2x2 matrix from invertible to singular.
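The two cases side by side in NumPy: an invertible matrix with a nonzero area-scaling factor, and a singular one whose rows are proportional:

```python
import numpy as np

M = np.array([[3.0, 1.0],
              [0.0, 2.0]])
detM = np.linalg.det(M)      # 6: the unit square maps to a parallelogram of area 6
Minv = np.linalg.inv(M)      # exists because detM != 0

S = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second row is twice the first
detS = np.linalg.det(S)      # 0 (up to rounding): space is crushed onto a line
# np.linalg.inv(S) would raise LinAlgError -- there is nothing to undo.
```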
Linear Transformations — Kernel, image, rank-nullity, and change of basis
Now that you know matrices are transformations, this module digs deeper: what gets sent to zero (the kernel), what the transformation can produce (the image), and the fundamental rank-nullity theorem that connects them. You'll also learn change of basis -- the technique behind diagonalization, PCA, and every 'feature extraction' pipeline in ML. Mini-lab: Compute the kernel and image of a transformation, verify the rank-nullity theorem, and perform a change of basis.
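One way to carry out the mini-lab numerically: take a rank-1 matrix, read the kernel off the SVD (the right singular vectors belonging to zero singular values), and confirm rank-nullity.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])      # second row = 2 x first, so rank 1

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank          # rank-nullity: rank + nullity = number of columns

# Kernel basis: right singular vectors whose singular values are zero.
_, s, Vt = np.linalg.svd(A)
null_basis = Vt[rank:].T             # columns span the kernel

# Every kernel vector is sent to zero: A @ null_basis is (numerically) 0.
```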
Eigenvalues and Eigenvectors — The directions that survive a transformation unchanged
Eigenvectors are the directions a matrix only stretches (never rotates). Eigenvalues tell you how much. This module builds geometric intuition first -- watching vectors get transformed and identifying the special ones that stay on their line -- then covers the characteristic polynomial and diagonalization. The payoff: eigenvalues are the key to PCA, Google's PageRank, and stability analysis. Mini-lab: Animate a 2x2 transformation showing all vectors being rotated except the eigenvectors, which only scale.
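The defining property, verified numerically for a small symmetric matrix (the animation in the mini-lab visualizes exactly this check):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

vals, vecs = np.linalg.eig(A)        # eigenvalues 3 and 1 for this matrix

for lam, v in zip(vals, vecs.T):     # columns of vecs are the eigenvectors
    # A only scales v -- it stays on its own line through the origin.
    assert np.allclose(A @ v, lam * v)
```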
PCA: Dimensionality Reduction — Finding the directions of maximum variance in your data
Principal Component Analysis is eigenvalues applied to data. You compute the covariance matrix of your dataset, find its eigenvectors (the principal components), and project onto the top-k directions of maximum variance. This module is the bridge from abstract eigenvalue theory to practical ML: you'll reduce a real dataset from high dimensions to 2D and visualize the clusters that emerge. Mini-lab: Run PCA on the Iris dataset -- reduce 4 features to 2, plot the result, and see the species separate.
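A from-scratch PCA sketch on synthetic 2-D data built so the answer is known in advance (the mini-lab itself uses the Iris dataset via scikit-learn; the pipeline below is the same):

```python
import numpy as np

# Synthetic data: stretched along one axis, then rotated 45 degrees,
# so the true first principal component is the direction (1, 1)/sqrt(2).
rng = np.random.default_rng(0)
raw = rng.normal(size=(2000, 2)) @ np.diag([3.0, 0.5])
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
data = raw @ R.T

# PCA by hand: center, covariance, eigendecomposition, project.
centered = data - data.mean(axis=0)
vals, vecs = np.linalg.eigh(np.cov(centered.T))  # eigenvalues in ascending order
top = vecs[:, -1]                                # first principal component
projected = centered @ top                       # 1-D coordinates along it
```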
Singular Value Decomposition — The Swiss army knife of matrix decompositions
SVD decomposes any matrix (not just square ones) into three factors: U * Sigma * V^T. The singular values in Sigma tell you how much 'information' each component carries. By keeping only the top-k singular values, you get the best rank-k approximation -- this is the mathematical foundation of image compression, latent semantic analysis, and (crucially) LoRA fine-tuning. This module builds from the geometry of SVD to hands-on applications. Mini-lab: Compress a photo by keeping only the top-k singular values. Watch the image degrade as you reduce k from 100 to 10 to 1.
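The compression step on a synthetic "image" (a nearly low-rank matrix standing in for a photo): keep the top-k singular values and watch the reconstruction error fall as k grows.

```python
import numpy as np

# A 64x64 matrix that is rank 3 plus a little noise -- a stand-in for a photo.
rng = np.random.default_rng(1)
img = rng.normal(size=(64, 3)) @ rng.normal(size=(3, 64)) \
      + 0.01 * rng.normal(size=(64, 64))

U, s, Vt = np.linalg.svd(img, full_matrices=False)

def rank_k(k):
    # Best rank-k approximation (Eckart-Young): keep the top-k singular triples.
    return (U[:, :k] * s[:k]) @ Vt[:k]

err1 = np.linalg.norm(img - rank_k(1))  # crude: one singular value
err3 = np.linalg.norm(img - rank_k(3))  # nearly perfect: matches the true rank
```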
Embeddings: From Words to Vectors — How AI maps discrete objects into continuous vector spaces
An embedding is a learned linear map from a discrete set (words, users, products) into a continuous vector space. This module covers the geometry of embeddings: why king - man + woman = queen works, how sentence embeddings capture semantic meaning, and what it means for LLM token embeddings to live in a 12,288-dimensional space. You'll also confront the curse of dimensionality -- why intuition breaks in high dimensions. Mini-lab: Build a semantic search engine. Embed a collection of sentences, compute cosine similarities, and retrieve the most relevant document for a query.
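The king - man + woman arithmetic, illustrated with hand-built 2-D toy vectors (axis 0 = "royalty", axis 1 = "gender"). Real embeddings are learned and live in hundreds of dimensions, but the vector arithmetic is the same:

```python
import numpy as np

# Toy vectors chosen by hand for illustration, not learned.
vec = {
    "king":  np.array([1.0,  1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
    "queen": np.array([1.0, -1.0]),
}

# king - man + woman: remove the "male" component, add the "female" one.
analogy = vec["king"] - vec["man"] + vec["woman"]

# Nearest word to the resulting point.
best = min(vec, key=lambda w: np.linalg.norm(vec[w] - analogy))
```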
Vector Search at Scale — ANN algorithms, HNSW, and the rise of vector databases
Once you have millions of embeddings, brute-force cosine similarity is too slow. This module covers approximate nearest neighbor (ANN) algorithms that trade a tiny bit of accuracy for massive speedups. You'll learn locality-sensitive hashing, IVF (inverted file index), HNSW (hierarchical navigable small world graphs), and product quantization. Then see how vector databases (pgvector, Qdrant, Pinecone) wrap these algorithms into production infrastructure for RAG and recommendation systems. Mini-lab: Index 10,000 embeddings with FAISS, compare brute-force vs. IVF vs. HNSW on speed and recall.
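To make the IVF idea concrete before reaching for FAISS, here is a from-scratch sketch: partition the database by nearest centroid, then at query time scan only the `n_probe` closest cells instead of all vectors.

```python
import numpy as np

rng = np.random.default_rng(2)
db = rng.normal(size=(10_000, 32)).astype(np.float32)
db /= np.linalg.norm(db, axis=1, keepdims=True)     # unit vectors: dot = cosine

# Build the index: pick centroids (here: random database vectors, where a real
# IVF index would run k-means) and assign every vector to its nearest cell.
n_cells = 16
centroids = db[rng.choice(len(db), n_cells, replace=False)]
assignments = np.argmax(db @ centroids.T, axis=1)

def ivf_search(query, n_probe=1):
    # Scan only the n_probe cells whose centroids best match the query.
    cells = np.argsort(centroids @ query)[-n_probe:]
    candidates = np.where(np.isin(assignments, cells))[0]
    sims = db[candidates] @ query
    return int(candidates[np.argmax(sims)])

query = db[123] + 0.01 * rng.normal(size=32).astype(np.float32)
query /= np.linalg.norm(query)
exact = int(np.argmax(db @ query))  # brute-force ground truth
```

Raising `n_probe` trades speed back for recall; probing all 16 cells is equivalent to brute force.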
Neural Networks as Linear Algebra — Dense layers, forward passes, and tensors beyond 2D
A dense neural network layer is just y = Wx + b -- a matrix multiplication plus a bias vector, followed by a nonlinear activation. This module strips away the deep learning mystique and shows you the linear algebra at the core. You'll manually implement a forward pass using only NumPy, then extend to tensors (3D+ arrays) that represent batches of images and sequences. Finally, you'll see that PyTorch's torch.Tensor is essentially a NumPy array with autograd and GPU support bolted on. Mini-lab: Manually code a 2-layer neural network forward pass using only NumPy matrix operations (no frameworks), then verify against PyTorch.
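The NumPy half of the mini-lab in miniature, with random (untrained) weights and toy layer sizes:

```python
import numpy as np

def relu(x):
    # Elementwise nonlinearity -- the only non-linear-algebra step.
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # layer 1: 3 inputs -> 4 hidden
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # layer 2: 4 hidden -> 2 outputs

def forward(x):
    # Each dense layer is y = Wx + b; stacking layers composes transformations.
    h = relu(W1 @ x + b1)
    return W2 @ h + b2

y = forward(np.array([1.0, -0.5, 0.2]))
```

Rebuilding the same two layers as `torch.nn.Linear` modules and copying the weights over should reproduce `y` exactly.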
The Attention Mechanism — How transformers use matrix operations to focus on what matters
Attention is the innovation behind transformers, and it's pure linear algebra. This module derives scaled dot-product attention step by step: project inputs into Query, Key, and Value matrices, compute attention scores via QK^T/sqrt(d), apply softmax, then multiply by V. You'll see that multi-head attention is just running several smaller attention operations in parallel -- a block-diagonal matrix structure. No black boxes. Mini-lab: Implement single-head and multi-head attention from scratch in NumPy. Feed in a sentence and visualize the attention weight matrix as a heatmap.
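Single-head scaled dot-product attention exactly as derived above, at toy sizes (5 tokens, model dimension 8, head dimension 4):

```python
import numpy as np

def softmax(x):
    # Subtract the row max first for numerical stability.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(X, Wq, Wk, Wv):
    # Project tokens into queries, keys, values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    # Scores QK^T / sqrt(d), softmaxed so each row sums to 1.
    weights = softmax(Q @ K.T / np.sqrt(d))
    # Each output is a weighted mix of the value vectors.
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                       # 5 tokens, dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out, weights = attention(X, Wq, Wk, Wv)
```

`weights` is the 5x5 matrix the mini-lab renders as a heatmap.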
LoRA: Low-Rank Adaptation — Fine-tuning billion-parameter models with tiny matrices
LoRA is the most practical application of matrix decomposition in modern AI. Instead of updating a giant weight matrix W (d x k parameters), you freeze W and train two small matrices B (d x r) and A (r x k) where r << d, so the update is Delta-W = B*A. This module connects SVD theory to practice: you'll understand why weight updates tend to be low-rank, implement a LoRA layer from scratch, and see how this enables fine-tuning LLMs on consumer hardware. Mini-lab: Implement a LoRA adapter layer. Compare parameter counts: full fine-tuning vs. LoRA with rank 4, 8, 16. Show that rank-8 captures 95%+ of the update.
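A bare-bones LoRA layer following the setup above (B initialized to zero, as in the original paper, so the adapter starts as a no-op), with the parameter-count comparison from the mini-lab:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 512, 512, 8                  # toy sizes; real LLM layers are far larger
W = rng.normal(size=(d, k))            # frozen pretrained weight (never updated)
B = np.zeros((d, r))                   # trainable, zero-init: Delta-W starts at 0
A = rng.normal(size=(r, k)) * 0.01     # trainable

def lora_forward(x):
    # (W + B A) x, computed without ever materializing the d x k update B A.
    return W @ x + B @ (A @ x)

full_params = d * k                    # 262,144 trainable params if fine-tuning W
lora_params = d * r + r * k            # 8,192 for rank-8 LoRA: a 32x reduction
x = rng.normal(size=k)
```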
Mechanistic Interpretability — Using linear algebra to reverse-engineer what neural networks learn
Mechanistic interpretability is the 'forensics' of AI: researchers use linear algebra to decode what individual neurons and layers actually represent. The key insight is the Linear Representation Hypothesis -- high-level concepts like 'truthfulness,' 'sentiment,' or 'programming language' are encoded as linear directions (vectors) in activation space. You can find these directions, measure them, and even add or subtract them to steer model behavior. This module covers probing classifiers, activation steering, and concept vectors. Mini-lab: Extract activations from a small language model, find the 'sentiment direction' using PCA on positive vs. negative examples, and show that adding this direction flips the model's output sentiment.
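A toy version of direction-finding on synthetic activations (the mini-lab extracts real activations and uses PCA; the sketch below uses a difference-of-means probe, a common simpler alternative, on made-up data where the true direction is known):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16
concept = rng.normal(size=dim)
concept /= np.linalg.norm(concept)   # the hidden "sentiment direction" (synthetic)

# Synthetic "activations": positive examples shifted along the concept
# direction, negative examples shifted the opposite way.
pos = rng.normal(size=(100, dim)) + 2.0 * concept
neg = rng.normal(size=(100, dim)) - 2.0 * concept

# Difference-of-means probe: the line between the class means is the
# linear direction that encodes the concept.
direction = pos.mean(axis=0) - neg.mean(axis=0)
direction /= np.linalg.norm(direction)

alignment = abs(direction @ concept)  # near 1 if the direction was recovered
```

Activation steering then amounts to adding a multiple of `direction` to a model's hidden states.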
Quantization: Precision vs. Speed — How shrinking numbers from 32 bits to 4 bits keeps AI fast and cheap
As models grow to hundreds of billions of parameters, storing every weight in 32-bit float becomes impractical. Quantization maps continuous values to a smaller discrete set using affine transformations: Q = round(W/S + Z). This module covers the linear algebra of quantization: why it's an affine transformation, how it distorts the vector space, why 'outlier features' break naive approaches, and how techniques like GPTQ, AWQ, and SmoothQuant handle this. You'll implement basic quantization and measure the accuracy/speed tradeoff. Mini-lab: Quantize a weight matrix to INT8 and INT4. Measure the reconstruction error (Frobenius norm of W - W_quantized) and see where outlier features cause problems.
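A minimal sketch of the mini-lab, using symmetric quantization (the zero-point Z from the formula above set to 0) on a random weight matrix:

```python
import numpy as np

def quantize(W, bits):
    # Symmetric affine quantization: Q = round(W / S), dequantized as Q * S.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W).max() / qmax          # one scale for the whole matrix
    Q = np.clip(np.round(W / scale), -qmax - 1, qmax)
    return Q, scale

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))

Q8, s8 = quantize(W, 8)                     # INT8: 256 levels
Q4, s4 = quantize(W, 4)                     # INT4: 16 levels

# Reconstruction error (Frobenius norm), as in the mini-lab.
err8 = np.linalg.norm(W - Q8 * s8)
err4 = np.linalg.norm(W - Q4 * s4)          # coarser grid -> larger error
```

Because the scale is set by the largest absolute weight, a single outlier inflates `scale` and wastes most of the levels -- exactly the failure mode GPTQ, AWQ, and SmoothQuant are designed around.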