Stretch, Then Shear. Or Shear, Then Stretch.
Take a square on a screen. First stretch it horizontally, then shear it to the right. Now reset and do those same two operations in the opposite order.
Same starting shape. Same two instructions. Different result.
That is matrix multiplication in one sentence. Matrices don't just hold numbers. They encode actions. And when you stack actions, order becomes part of the answer.
Each Entry Is One Dot Product
Forget the scary brackets for a second. Matrix multiplication is just repeated row-by-column dot products.
Suppose

A = [ 1  2 ]      B = [ 5  6 ]
    [ 3  4 ]          [ 7  8 ]

The top-left entry of AB comes from the first row of A and the first column of B:

(AB)11 = (1)(5) + (2)(7) = 19

Do that for every row-column pairing and you get

AB = [ 19  22 ]
     [ 43  50 ]
That's the whole engine. No mystery. Just lots of little dot products done in a disciplined way.
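That disciplined loop of dot products fits in a few lines of plain Python. A minimal sketch (the function name and example matrices are illustrative, not part of the text above):

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows.

    Entry (i, j) of the result is the dot product of
    row i of A with column j of B.
    """
    inner = len(B)            # shared "inside" dimension
    assert len(A[0]) == inner, "inner dimensions must match"
    return [
        [sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```

Nothing clever is happening: three nested loops, each innermost pass one dot product.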
The Inner Dimensions Are the Gatekeeper
Students memorize "rows times columns" and still get stuck on when multiplication is allowed. The cleaner rule is this: the inside numbers have to match.
A 2 x 3 matrix can multiply a 3 x 4 matrix because the 3s match. The result is 2 x 4. But a 3 x 4 matrix cannot multiply a 2 x 3 matrix in that order. The middle numbers, 4 and 2, disagree, so the row-column pairings don't line up.
I think of it as a lock and key. The first matrix provides rows. The second provides columns. If the row length and column length don't match, there is nothing to multiply.
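The lock-and-key rule is a one-line check. A sketch (can_multiply is a hypothetical helper name for illustration):

```python
def can_multiply(shape_a, shape_b):
    """An (m, n) matrix can multiply a (p, q) matrix only when n == p."""
    return shape_a[1] == shape_b[0]

print(can_multiply((2, 3), (3, 4)))  # True: the inner 3s match, result is 2 x 4
print(can_multiply((3, 4), (2, 3)))  # False: inner dimensions 4 and 2 disagree
```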
Why Order Matters More Than The Numbers
This is where matrix multiplication stops feeling like arithmetic and starts feeling like choreography.
Let

A = [ 2  0 ]      B = [ 1  1 ]
    [ 0  1 ]          [ 0  1 ]

A stretches the plane horizontally; B shears it to the right.

First

AB = [ 2  2 ]
     [ 0  1 ]

Then

BA = [ 2  1 ]
     [ 0  1 ]

Same matrices. Different products. Which means matrix multiplication is not commutative: AB does not equal BA in general.
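A quick check, assuming NumPy, using a horizontal stretch and a rightward shear as the two actions from the opening example:

```python
import numpy as np

stretch = np.array([[2, 0],
                    [0, 1]])  # doubles x, leaves y alone
shear = np.array([[1, 1],
                  [0, 1]])    # pushes points to the right in proportion to y

# When a product acts on a vector v, the rightmost matrix acts first:
# (shear @ stretch) @ v means "stretch, then shear".
print(shear @ stretch)   # [[2 1] [0 1]]
print(stretch @ shear)   # [[2 2] [0 1]]
print(np.array_equal(shear @ stretch, stretch @ shear))  # False
```

Two products, two different matrices, so the two choreographies genuinely move the square to different places.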
That's not some annoying algebra exception. It's the point. In graphics, "rotate then scale" is different from "scale then rotate." In robotics, one movement changes the coordinate frame for the next. In machine learning, each layer transforms the data before handing it to the next layer. Order is part of the meaning.
The Quiet Engine Behind Search, Graphics, and AI
Matrix multiplication shows up anywhere lots of inputs need to be mixed into lots of outputs.
- Computer graphics: rotating, scaling, and projecting 3D objects onto a 2D screen
- Neural networks: every dense layer is a matrix multiply plus a nonlinearity
- Economics and statistics: input-output models, regression, covariance transforms
- Linear systems: row operations, inverses, and solution structure all live nearby
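To make the neural-network bullet concrete, here is a minimal sketch of a dense layer, assuming NumPy. The shapes and random weights are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# A dense layer on a batch of inputs: outputs = relu(X @ W + b).
X = rng.normal(size=(4, 3))   # 4 samples, each with 3 features
W = rng.normal(size=(3, 2))   # mixes 3 inputs into 2 outputs
b = np.zeros(2)               # one bias per output

out = np.maximum(X @ W + b, 0.0)  # matrix multiply, then the ReLU nonlinearity
print(out.shape)  # (4, 2)
```

One matrix multiply mixes every input feature into every output, which is exactly the "lots of inputs into lots of outputs" pattern above.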
That's why matrix multiplication keeps pointing toward other ideas. Once you're comfortable with products, the next natural stops are determinants and reduced forms. Our guide on matrix determinants explains how a matrix scales area or volume, and the RREF guide shows how matrices expose whether a system actually has a solution.
Quick Questions
Why does the row-column rule work?
Because each output entry measures how one row of the first matrix combines with one column of the second. If those vectors are different lengths, the pairwise multiplication can't happen.
Can matrix multiplication ever be commutative?
Sometimes, but only in special cases. Identity matrices commute with everything. Some diagonal matrices commute with each other. Most matrices do not.
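Those special cases are easy to verify directly, assuming NumPy:

```python
import numpy as np

A = np.array([[2, 1], [0, 3]])
I = np.eye(2)
print(np.array_equal(A @ I, I @ A))  # True: the identity commutes with everything

D1 = np.diag([2, 3])
D2 = np.diag([4, 5])
print(np.array_equal(D1 @ D2, D2 @ D1))  # True: diagonal matrices commute
```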
Is multiplying matrices the same as multiplying entries one by one?
No. Entrywise multiplication is a different operation. Standard matrix multiplication mixes rows and columns so it can represent composition of linear transformations.
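In NumPy the contrast is one character apart: * is the entrywise (Hadamard) product and @ is the standard matrix product.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A * B)   # entrywise: [[ 5 12] [21 32]]
print(A @ B)   # matrix product: [[19 22] [43 50]]
```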