Tensors generalize vectors and matrices to higher dimensions. They are fundamental in physics (general relativity, continuum mechanics) and modern machine learning. This introduction covers the essential concepts of multilinear maps, tensor products, and index notation.
A map B: V × W → U is bilinear if: B(av₁ + bv₂, w) = aB(v₁, w) + bB(v₂, w) and B(v, aw₁ + bw₂) = aB(v, w₁) + bB(v, w₂).
A k-linear map is linear in each argument separately.
The determinant det: (ℝⁿ)ⁿ → ℝ is n-linear and alternating: swapping any two arguments flips the sign, det(…, vᵢ, …, vⱼ, …) = −det(…, vⱼ, …, vᵢ, …).
A bilinear map is NOT linear as a map V × W → U. For example: B(2v, 2w) = 4B(v, w), not 2B(v, w).
Bilinear means linear in each slot separately.
If {eᵢ} is a basis for V and {fⱼ} for W, then B(v, w) is determined by the values B(eᵢ, fⱼ).
A bilinear form on ℝⁿ is represented by an n×n matrix A via B(x, y) = xᵀAy, where Aᵢⱼ = B(eᵢ, eⱼ).
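A quick numerical check of this representation (a minimal sketch; the matrix A and the test vectors are arbitrary example values):

```python
import numpy as np

# Example bilinear form on R^2 represented by a 2x2 matrix A (values assumed)
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

def B(x, y):
    """Bilinear form B(x, y) = x^T A y."""
    return x @ A @ y

x, x2, y = np.array([1.0, -1.0]), np.array([0.5, 2.0]), np.array([2.0, 3.0])
a, b = 2.0, -3.0

# Linear in the first slot: B(a*x + b*x2, y) = a*B(x, y) + b*B(x2, y)
assert np.isclose(B(a*x + b*x2, y), a*B(x, y) + b*B(x2, y))

# But NOT linear as a map on V x W: scaling both arguments scales B by 4, not 2
assert np.isclose(B(2*x, 2*y), 4*B(x, y))
```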
The tensor product V ⊗ W is a vector space together with a bilinear map ⊗: V × W → V ⊗ W such that any bilinear map on V × W factors uniquely through V ⊗ W.
If {eᵢ} is a basis for V and {fⱼ} for W, then {eᵢ ⊗ fⱼ} is a basis for V ⊗ W, so dim(V ⊗ W) = dim(V) · dim(W).
For v ∈ ℝᵐ and w ∈ ℝⁿ, the outer product v ⊗ w = vwᵀ is an m×n matrix: (v ⊗ w)ᵢⱼ = vᵢwⱼ.
For example, with v ∈ ℝ² and w ∈ ℝ³, v ⊗ w is the 2×3 matrix with entries vᵢwⱼ.
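A minimal NumPy illustration (the vector values are chosen arbitrarily for the sketch, since the original example is not shown):

```python
import numpy as np

v = np.array([1.0, 2.0])        # v in R^2 (example values assumed)
w = np.array([3.0, 4.0, 5.0])   # w in R^3

outer = np.outer(v, w)          # (v ⊗ w)_{ij} = v_i * w_j, shape (2, 3)
print(outer)
# [[ 3.  4.  5.]
#  [ 6.  8. 10.]]
```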
The tensor product is characterized by its universal property:
For any bilinear B: V × W → U, there exists a unique linear L: V ⊗ W → U such that B(v, w) = L(v ⊗ w).
Simple tensors (or decomposable) have form v ⊗ w.
General tensors are sums of simple tensors: T = Σₖ vₖ ⊗ wₖ.
Not every tensor is simple! Most tensors require multiple terms.
In ℝ² ⊗ ℝ², the tensor e₁ ⊗ e₁ + e₂ ⊗ e₂ cannot be written as a single u ⊗ v (verify by trying to factor).
The SVD A = UΣVᵀ expresses a matrix as a sum of simple tensors: A = Σᵢ σᵢ uᵢ ⊗ vᵢ = Σᵢ σᵢ uᵢvᵢᵀ.
Rank = minimum number of simple tensors needed.
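A sketch of this idea in NumPy (the example matrix is assumed, built as a sum of two outer products): the number of nonzero singular values is the matrix rank, i.e. the minimum number of simple tensors σᵢ uᵢ ⊗ vᵢ needed.

```python
import numpy as np

# A rank-2 example: sum of two simple tensors (outer products)
A = np.outer([1.0, 0.0, 1.0], [1.0, 2.0]) + np.outer([0.0, 1.0, 1.0], [3.0, -1.0])

U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))     # count nonzero singular values
print(rank)                        # 2

# Reconstruct A as a sum of `rank` simple tensors
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(rank))
assert np.allclose(A, A_rebuilt)
```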
Repeated indices (one upper, one lower) imply summation: aᵢbⁱ means Σᵢ aᵢbⁱ.
Using Einstein notation, the matrix-vector product y = Ax is written yⁱ = Aⁱⱼxʲ.
The index j is summed (it appears up and down).
The product C = AB in Einstein notation: Cⁱₖ = AⁱⱼBʲₖ.
Index j appears up and down, so we sum over j.
The trace of a matrix: tr(A) = Aⁱᵢ.
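As a check of the convention (a minimal NumPy sketch with arbitrary example arrays), explicit loops over the repeated index agree with `np.einsum`:

```python
import numpy as np

A = np.arange(9.0).reshape(3, 3)   # example (1,1)-tensor components
x = np.array([1.0, -1.0, 2.0])     # example vector

# y^i = A^i_j x^j: j is repeated, so it is summed
y_loops = np.array([sum(A[i, j] * x[j] for j in range(3)) for i in range(3)])
y_einsum = np.einsum('ij,j->i', A, x)
assert np.allclose(y_loops, y_einsum)

# tr(A) = A^i_i: the repeated index i is summed
assert np.isclose(np.einsum('ii->', A), np.trace(A))
```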
The metric tensor converts between index types: vᵢ = gᵢⱼvʲ and vⁱ = gⁱʲvⱼ,
where gⁱʲ is the inverse of gᵢⱼ.
In Euclidean space with an orthonormal basis: gᵢⱼ = δᵢⱼ (the Kronecker delta).
So vⁱ = vᵢ: there is no distinction between upper and lower indices.
In yⁱ = Aⁱⱼxʲ: i is free, j is dummy (summed).
A (p,q)-tensor is a multilinear map T: V* × ⋯ × V* (p copies) × V × ⋯ × V (q copies) → ℝ.
It has p contravariant (upper) and q covariant (lower) indices: T^{i₁…i_p}_{j₁…j_q}.
Under a change of basis e′ⱼ = Aⁱⱼ eᵢ, a (p,q)-tensor transforms as: T′^{i₁…i_p}_{j₁…j_q} = (A⁻¹)^{i₁}_{k₁} ⋯ (A⁻¹)^{i_p}_{k_p} A^{l₁}_{j₁} ⋯ A^{l_q}_{j_q} T^{k₁…k_p}_{l₁…l_q}.
This transformation law DEFINES what a tensor is!
Contraction sets one upper and one lower index equal and sums: Tⁱⱼₖ → Tⁱⱼᵢ = Σᵢ Tⁱⱼᵢ.
Reduces rank by 2 (one upper, one lower index removed).
If S is (p₁,q₁) and T is (p₂,q₂), then S⊗T is (p₁+p₂, q₁+q₂): its components are products of components, (S⊗T)^{i…k…}_{j…l…} = S^{i…}_{j…} T^{k…}_{l…}.
Tensors form an algebra with addition and scalar multiplication (for tensors of the same type), the tensor product, and contraction.
A (0,2)-tensor is symmetric if Tᵢⱼ = Tⱼᵢ and antisymmetric if Tᵢⱼ = −Tⱼᵢ.
Any (0,2)-tensor can be decomposed into symmetric and antisymmetric parts.
For any (0,2)-tensor T: Tᵢⱼ = ½(Tᵢⱼ + Tⱼᵢ) + ½(Tᵢⱼ − Tⱼᵢ).
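A quick NumPy sketch of this decomposition for a (0,2)-tensor stored as a square array (the components are arbitrary example values):

```python
import numpy as np

T = np.array([[1.0, 4.0, 0.0],
              [2.0, 5.0, 7.0],
              [3.0, 6.0, 9.0]])   # arbitrary (0,2)-tensor components

sym  = 0.5 * (T + T.T)    # symmetric part:      S_ij =  S_ji
anti = 0.5 * (T - T.T)    # antisymmetric part:  A_ij = -A_ji

assert np.allclose(sym, sym.T)
assert np.allclose(anti, -anti.T)
assert np.allclose(T, sym + anti)   # the two parts recover T
```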
Metric tensor, Riemann curvature, Einstein field equations—all tensor equations.
Stress tensor, strain tensor, elasticity tensor relate forces and deformations.
Data as tensors (images, sequences). Tensor decomposition for compression and analysis.
State spaces as tensor products. Entanglement is non-separability in tensor product.
The stress tensor relates force to surface orientation: Fᵢ = σᵢⱼnⱼA,
where n is the unit surface normal and A is the area. It's a (0,2) symmetric tensor.
The inertia tensor relates angular momentum to angular velocity: Lᵢ = Iᵢⱼωⱼ (L = Iω).
Diagonalizing I gives principal axes and principal moments.
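A sketch of this diagonalization, assuming an arbitrary symmetric inertia tensor: `np.linalg.eigh` returns the principal moments (eigenvalues) and principal axes (eigenvectors).

```python
import numpy as np

# Example symmetric inertia tensor (values assumed for illustration)
I = np.array([[ 2.0, -0.5, 0.0],
              [-0.5,  3.0, 0.1],
              [ 0.0,  0.1, 4.0]])

moments, axes = np.linalg.eigh(I)   # eigenvalues = principal moments, columns = principal axes

# In the principal-axis frame the inertia tensor is diagonal: R^T I R = diag(moments)
assert np.allclose(axes.T @ I @ axes, np.diag(moments))

# Angular momentum L_i = I_ij w_j
omega = np.array([0.1, 0.0, 2.0])
L = I @ omega
```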
The metric tensor defines inner products and distances: ⟨u, v⟩ = gᵢⱼuⁱvʲ and ds² = gᵢⱼdxⁱdxʲ.
In general relativity, the metric encodes gravitational effects.
Physicists often write tensors with explicit indices, e.g. the metric g_{μν} or the field tensor F_{μν}, rather than as abstract maps.
In deep learning frameworks, a "tensor" is simply a multidimensional array; the transformation law is usually ignored.
The Levi-Civita symbol (in 3D): εᵢⱼₖ = +1 if (i,j,k) is an even permutation of (1,2,3), −1 if odd, and 0 if any index repeats.
The cross product in index notation: (a × b)ᵢ = εᵢⱼₖaʲbᵏ.
The 3×3 determinant: det(A) = εⁱʲᵏa₁ᵢa₂ⱼa₃ₖ.
The Levi-Civita symbol is a pseudo-tensor: under an orientation-reversing change of basis (determinant < 0) it picks up an extra sign that a true tensor would not.
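A sketch that builds εᵢⱼₖ explicitly (the construction via permutation-matrix determinants is just one convenient choice) and checks the cross-product and determinant formulas above with arbitrary example values:

```python
import numpy as np
from itertools import permutations

# Levi-Civita symbol in 3D: +1/-1 on permutations of (0,1,2), 0 elsewhere
eps = np.zeros((3, 3, 3))
for i, j, k in permutations(range(3)):
    # sign of the permutation = determinant of the corresponding permutation matrix
    eps[i, j, k] = np.sign(np.linalg.det(np.eye(3)[[i, j, k]]))

a, b = np.array([1.0, 2.0, 3.0]), np.array([-1.0, 0.5, 2.0])

# Cross product: (a x b)_i = eps_ijk a_j b_k
assert np.allclose(np.einsum('ijk,j,k->i', eps, a, b), np.cross(a, b))

# Determinant: det(A) = eps_ijk A_1i A_2j A_3k  (rows of A)
A = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [4.0, 0.0, 1.0]])
assert np.isclose(np.einsum('ijk,i,j,k->', eps, A[0], A[1], A[2]), np.linalg.det(A))
```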
| Type | Components | Example |
|---|---|---|
| (0,0) | 1 | Scalar |
| (1,0) | n | Vector vⁱ |
| (0,1) | n | Covector ωᵢ |
| (1,1) | n² | Linear map Aⁱⱼ |
| (0,2) | n² | Metric gᵢⱼ |
| (p,q) | n^(p+q) | General tensor |
The rank of a tensor T is the minimum number R of simple tensors needed to write T = Σᵣ₌₁ᴿ vᵣ ⊗ wᵣ ⊗ ⋯.
For matrices, this is ordinary matrix rank. For higher-order tensors, it's more complex.
SVD is a tensor decomposition for 2-tensors (matrices): A = Σᵢ σᵢ uᵢ ⊗ vᵢ.
This is the CP decomposition for matrices.
A matrix is a specific representation. A tensor is defined by transformation laws. Different bases give different matrices for same tensor.
Contraction pairs one upper index with one lower index. Tⁱʲ cannot be contracted against Tᵏˡ, since all four indices are upper; a metric is needed to lower one of them first.
Most tensors require sums of simple tensors. e₁⊗e₁ + e₂⊗e₂ cannot be written as u⊗v.
Tensors lead to:
Let V = ℝ² with basis {e₁, e₂} and W = ℝ³ with basis {f₁, f₂, f₃}.
Basis for V⊗W (dimension 6): {e₁⊗f₁, e₁⊗f₂, e₁⊗f₃, e₂⊗f₁, e₂⊗f₂, e₂⊗f₃}.
Express a given simple tensor v ⊗ w in this basis; a numerical version of this exercise follows.
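A concrete sketch (the coordinate values are assumed, since the original example is not shown): the components of v ⊗ w in the basis {eᵢ ⊗ fⱼ} are exactly the entries of the outer-product matrix, listed in the basis order above.

```python
import numpy as np

v = np.array([2.0, -1.0])        # coordinates of v in {e1, e2}   (example values assumed)
w = np.array([1.0, 4.0, 3.0])    # coordinates of w in {f1, f2, f3}

components = np.outer(v, w)      # entry (i, j) is the coefficient of e_i ⊗ f_j
coeffs = components.reshape(-1)  # flatten in the order e1⊗f1, e1⊗f2, ..., e2⊗f3
print(coeffs)                    # [ 2.  8.  6. -1. -4. -3.]
```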
For a (1,1)-tensor Tⁱⱼ with given components, compute the contraction Tⁱᵢ (the trace).
In 2D polar coordinates, the metric tensor is gᵣᵣ = 1, g_θθ = r², with the off-diagonal components zero.
Line element: ds² = dr² + r²dθ².
The electromagnetic field tensor F_{μν} is an antisymmetric (0,2)-tensor: F_{μν} = −F_{νμ}, with components built from the electric and magnetic fields.
Maxwell's equations become tensor equations!
The wedge product (exterior product) is antisymmetric: v ∧ w = −w ∧ v, so v ∧ v = 0.
For u = a e₁ + b e₂ and v = c e₁ + d e₂: u ∧ v = (ad − bc) e₁ ∧ e₂.
Note: the e₁∧e₁ and e₂∧e₂ terms vanish.
Exterior algebra is the foundation of differential forms: antisymmetric tensors that can be integrated over curves, surfaces, and higher-dimensional regions.
| Symbol | Meaning |
|---|---|
| V ⊗ W | Tensor product of spaces |
| v ⊗ w | Simple tensor (outer product) |
| Tⁱⱼ | (1,1)-tensor components |
| gᵢⱼ | Metric tensor |
| εᵢⱼₖ | Levi-Civita symbol |
| δⁱⱼ | Kronecker delta |
Compute the dimension of V⊗W⊗U where dim(V)=2, dim(W)=3, dim(U)=4.
Solution: dim(V⊗W⊗U) = 2×3×4 = 24
Write the trace of A in Einstein notation.
Solution: tr(A) = Aⁱᵢ = A¹₁ + A²₂ + ... + Aⁿₙ
If Tⁱⱼ is a (1,1)-tensor and vʲ is a (1,0)-tensor, what is Tⁱⱼvʲ?
Solution: A (1,0)-tensor (vector). The j index is contracted.
Show that εⁱʲᵏεᵢⱼₖ = 6 in ℝ³.
Solution: Sum over all index triples: each nonzero term contributes (±1)² = 1, and there are 3! = 6 such triples.
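A numerical spot-check of this identity (a small sketch; the explicit ε construction mirrors the one used earlier, and in an orthonormal basis upper and lower indices coincide):

```python
import numpy as np
from itertools import permutations

eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    eps[p] = np.sign(np.linalg.det(np.eye(3)[list(p)]))   # sign of the permutation

# eps^{ijk} eps_{ijk} = 6
print(np.einsum('ijk,ijk->', eps, eps))   # 6.0
```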
You've mastered tensors when you can:
In quantum mechanics, composite systems use tensor products: the state space of a composite system is the tensor product H₁ ⊗ H₂ of the subsystem state spaces.
A product state (separable) has form |ψ⟩ ⊗ |φ⟩.
An entangled state cannot be written as a single product.
The Bell state (maximally entangled): |Φ⁺⟩ = (|00⟩ + |11⟩)/√2.
Cannot be written as |a⟩ ⊗ |b⟩ — it's entangled!
Quantum computers exploit tensor product structure: n qubits live in the 2ⁿ-dimensional space (ℂ²)^⊗n.
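A minimal sketch in NumPy: build a product state with `np.kron`, build the Bell state, and use the singular values of the reshaped coefficient matrix (the Schmidt coefficients) to distinguish separable from entangled states.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

product_state = np.kron(ket0, ket1)                               # |0> ⊗ |1>, separable
bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)

def schmidt_rank(state):
    """Number of nonzero Schmidt coefficients of a two-qubit state."""
    coeff_matrix = state.reshape(2, 2)   # psi_{ij}, one index per qubit
    return int(np.sum(np.linalg.svd(coeff_matrix, compute_uv=False) > 1e-12))

print(schmidt_rank(product_state))   # 1 -> separable (a single simple tensor)
print(schmidt_rank(bell))            # 2 -> entangled (cannot be written as |a> ⊗ |b>)
```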
A tensor network is a collection of tensors with some indices contracted between them, represented as a graph.
An MPS represents a many-body state efficiently: ψ_{s₁…sₙ} = A^[1]_{s₁} A^[2]_{s₂} ⋯ A^[n]_{sₙ}, a product of matrices chosen by the local states (closed off with boundary vectors or a trace).
Each A^[k] is a matrix, and we multiply them to get coefficients.
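A small sketch of this, with assumed random matrices A^[k], assumed sizes (n, d, chi), and all-ones boundary vectors chosen just to close the chain:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n, d, chi = 4, 2, 3          # sites, local dimension, bond dimension (all assumed)

# One matrix A^[k]_{s} of shape (chi, chi) per site k and local state s
A = [rng.normal(size=(d, chi, chi)) for _ in range(n)]
left, right = np.ones(chi), np.ones(chi)   # boundary vectors (an assumed convention)

def coefficient(spins):
    """psi_{s1...sn} = left^T A^[1]_{s1} ... A^[n]_{sn} right."""
    M = np.eye(chi)
    for k, s in enumerate(spins):
        M = M @ A[k][s]
    return left @ M @ right

# The full state has d**n coefficients, but the MPS stores only n*d*chi*chi numbers
psi = np.array([coefficient(s) for s in product(range(d), repeat=n)])
print(psi.shape)   # (16,)
```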
Key developments in tensor theory:
Tensors provide a coordinate-independent way to describe physical laws. Einstein famously said the happiest thought of his life was realizing that physical laws should take the same form in all coordinate systems — which requires tensor equations.
Tensors generalize vectors and matrices to arbitrarily many indices. They provide the natural language for:
Relativity, electromagnetism, continuum mechanics — all tensor theories.
Differential geometry, algebraic topology, representation theory.
Deep learning, quantum computing, data science — all tensor computations.
From here, explore:
For matrices A (m×n) and B (p×q), the Kronecker product A⊗B is the (mp×nq) block matrix [aᵢⱼB].
For example, for 2×2 matrices A and B, A⊗B is the 4×4 block matrix [[a₁₁B, a₁₂B], [a₂₁B, a₂₂B]].
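A sketch with assumed 2×2 example matrices: `np.kron` builds exactly this block matrix.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

K = np.kron(A, B)   # 4x4 block matrix [[1*B, 2*B], [3*B, 4*B]]
print(K)
# [[0. 1. 0. 2.]
#  [1. 0. 2. 0.]
#  [0. 3. 0. 4.]
#  [3. 0. 4. 0.]]
```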
The vec operation stacks the columns of a matrix into a vector: vec(X) ∈ ℝᵐⁿ for X ∈ ℝᵐˣⁿ.
The identity vec(AXB) = (Bᵀ ⊗ A) vec(X) connects matrix equations to Kronecker products.
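A numerical check of this identity with arbitrary random matrices; note that vec stacks columns, which in NumPy means flattening in Fortran (column-major) order.

```python
import numpy as np

rng = np.random.default_rng(1)
A, X, B = rng.normal(size=(2, 3)), rng.normal(size=(3, 4)), rng.normal(size=(4, 5))

def vec(M):
    """Stack the columns of M into a single vector (column-major flatten)."""
    return M.flatten(order='F')

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
assert np.allclose(lhs, rhs)   # vec(AXB) = (B^T ⊗ A) vec(X)
```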
The Hadamard product (element-wise) for same-sized matrices: (A ∘ B)ᵢⱼ = AᵢⱼBᵢⱼ.
| Library | Language | Focus |
|---|---|---|
| NumPy | Python | General arrays, einsum |
| PyTorch | Python | Deep learning, autograd |
| TensorFlow | Python | Deep learning, TPU support |
| ITensor | C++/Julia | Tensor networks, physics |
Einstein summation in NumPy:
- `np.einsum('ij,jk->ik', A, B)`: matrix product
- `np.einsum('ii->', A)`: trace
- `np.einsum('i,j->ij', u, v)`: outer product

Tensor operations can be expensive: the cost of a contraction grows quickly with the number and size of the indices involved.
Tensor networks provide efficient representations!
Compute for 2×2 matrices A, B.
Solution: By Kronecker product property:
If T is a (2,1)-tensor in ℝ³, how many components does it have?
Solution: 3² × 3 = 27 components
Show that the trace is basis-independent.
Solution: Under a change of basis P, A ↦ P⁻¹AP, and tr(P⁻¹AP) = tr(APP⁻¹) = tr(A) by cyclicity of the trace.
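A quick numerical illustration of this invariance, using an arbitrary random change-of-basis matrix P:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
P = rng.normal(size=(4, 4))          # a generic random matrix is invertible (with probability 1)

A_new = np.linalg.inv(P) @ A @ P     # the same (1,1)-tensor in the new basis
assert np.isclose(np.trace(A), np.trace(A_new))
```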
A tensor is a multilinear map from products of vector spaces (and their duals) to scalars. Equivalently, it's an array of numbers that transforms in a specific way under change of basis.
A matrix is a 2D array, specifically a (1,1)-tensor or a representation of a linear map. Tensors generalize to any number of indices and include specific transformation rules.
V ⊗ W is a new vector space such that bilinear maps from V × W correspond to linear maps from V ⊗ W. Elements v ⊗ w are 'simple tensors'; general elements are sums of these.
It compactly expresses sums over indices. Instead of Σᵢ aᵢbⁱ, write aᵢbⁱ. Repeated indices (one up, one down) imply summation. Makes tensor equations cleaner.
Contravariant (upper indices) transform opposite to basis vectors. Covariant (lower indices) transform like basis vectors. The distinction matters for non-orthonormal bases.
Stress, strain, inertia, electromagnetic field, metric in general relativity—all tensors. They express physical laws independent of coordinate choice.
Data is stored in tensors: images (H×W×C), batches (N×H×W×C), sequences (N×T×D). Deep learning frameworks (PyTorch, TensorFlow) are built around tensor operations.
Setting one upper and one lower index equal and summing: Tⁱⱼₖ → Tⁱⱼᵢ = Σᵢ Tⁱⱼᵢ. Reduces rank by 2. Matrix trace is a contraction.
The determinant can be expressed using the Levi-Civita tensor εⁱʲᵏ. det(A) = εⁱʲᵏ a₁ᵢ a₂ⱼ a₃ₖ. This connects determinants to multilinear algebra.
The cross product is actually a pseudovector (axial vector). It uses the Levi-Civita symbol and behaves differently under reflections than true vectors.