Multilinear Algebra & Applications
This final course introduces multilinear algebra and tensors, which generalize vectors and matrices to higher dimensions. We then explore the vast applications of linear algebra in machine learning, computer graphics, quantum computing, network analysis, and many other fields.
Tensors were introduced by Woldemar Voigt (1850–1919) in the 1890s and developed by Gregorio Ricci-Curbastro (1853–1925) and Tullio Levi-Civita (1873–1941). Albert Einstein (1879–1955) used tensor calculus in general relativity. The modern applications of linear algebra have exploded with the digital age: machine learning relies heavily on matrix operations, Larry Page and Sergey Brin used eigenvectors for PageRank in 1998, and quantum computing is fundamentally built on linear algebra over complex numbers. Today, linear algebra is the mathematical foundation of data science, AI, graphics, and many other fields.
Tensors generalize vectors and matrices to higher dimensions. They are fundamental in physics, differential geometry, and modern machine learning.
A function f : V₁ × ⋯ × Vₙ → W is multilinear if it is linear in each argument when the others are held fixed.
A bilinear map is multilinear with n = 2.
The tensor product V ⊗ W is a vector space such that bilinear maps from V × W correspond to linear maps from V ⊗ W.
Elements of the form v ⊗ w are called simple tensors. General elements are sums of these.
If dim V = m and dim W = n, then dim(V ⊗ W) = mn.
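As a quick numerical illustration (a minimal NumPy sketch, not part of the original text): in the standard bases, the coordinates of a simple tensor v ⊗ w are the pairwise products vᵢwⱼ, which is exactly what the Kronecker product computes.

```python
import numpy as np

# Coordinates of a simple tensor v ⊗ w in the standard basis of V ⊗ W:
# the Kronecker product lists all pairwise products v_i * w_j.
v = np.array([1.0, 2.0])          # dim V = 2
w = np.array([3.0, 4.0, 5.0])     # dim W = 3

simple_tensor = np.kron(v, w)     # length 2 * 3 = 6 = dim(V ⊗ W)
print(simple_tensor)              # [ 3.  4.  5.  6.  8. 10.]
```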
A (p,q)-tensor is a multilinear map from V* × ⋯ × V* (p copies) × V × ⋯ × V (q copies) to ℝ.
Equivalently, it's an array of numbers with p contravariant (upper) indices and q covariant (lower) indices.
When an index appears twice, once as upper and once as lower, summation over that index is implied; this is the Einstein summation convention.
Example: w^i = A^i_j v^j means w^i = Σ_j A^i_j v^j, which is ordinary matrix-vector multiplication.
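The same contraction can be written with NumPy's einsum, which uses exactly this index notation (a small sketch, assuming NumPy):

```python
import numpy as np

# Einstein summation: w^i = A^i_j v^j sums over the repeated index j.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([5.0, 6.0])

w_einsum = np.einsum('ij,j->i', A, v)   # explicit index notation
w_matmul = A @ v                        # ordinary matrix-vector product

assert np.allclose(w_einsum, w_matmul)
print(w_einsum)                         # [17. 39.]
```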
Tensors are characterized by how their components transform under change of basis. This makes them coordinate-independent objects.
Linear algebra is the mathematical foundation of machine learning. Neural networks, data representation, and optimization all rely heavily on matrix operations.
A fully connected layer computes y = σ(Wx + b), where:
- x is the input vector,
- W is the weight matrix,
- b is the bias vector,
- σ is a nonlinear activation applied entrywise.
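A minimal NumPy sketch of such a layer (the ReLU activation and the layer sizes are illustrative choices, not taken from the text):

```python
import numpy as np

# Forward pass of one fully connected layer: y = sigma(W x + b).
def relu(z):
    return np.maximum(z, 0.0)          # one common choice of activation

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))        # weight matrix: 4 inputs -> 3 outputs
b = np.zeros(3)                        # bias vector
x = rng.standard_normal(4)             # input vector

y = relu(W @ x + b)
print(y.shape)                         # (3,)
```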
Data in ML is stored in tensors:
- a single feature vector is a 1D tensor,
- a batch of feature vectors (or a grayscale image) is a 2D tensor,
- a color image is a 3D tensor (height × width × channels),
- a batch of images is a 4D tensor.
Training neural networks uses gradient descent, which requires computing gradients of matrix-valued functions. The chain rule involves matrix multiplication and transposition.
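For instance, for a single linear layer y = Wx, the chain rule gives ∂L/∂W = (∂L/∂y) xᵀ and ∂L/∂x = Wᵀ (∂L/∂y). A small sketch of that step (not from the text):

```python
import numpy as np

# Backpropagation through a single linear layer y = W @ x.
# Given grad_y = dL/dy, the chain rule yields matrix products with transposes.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

y = W @ x
grad_y = rng.standard_normal(3)        # upstream gradient dL/dy (arbitrary here)

grad_W = np.outer(grad_y, x)           # dL/dW = grad_y · xᵀ, same shape as W
grad_x = W.T @ grad_y                  # dL/dx = Wᵀ · grad_y
print(grad_W.shape, grad_x.shape)      # (3, 4) (4,)
```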
PCA finds directions of maximum variance in data. It uses SVD of the centered data matrix. Principal components are right singular vectors, and explained variance comes from squared singular values.
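A minimal sketch of PCA via SVD (assuming NumPy; the data here is random and purely illustrative):

```python
import numpy as np

# PCA via SVD of the centered data matrix X (rows = samples, columns = features).
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))

Xc = X - X.mean(axis=0)                    # center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                            # principal components = right singular vectors
explained_var = S**2 / (len(X) - 1)        # variance explained by each component
scores = Xc @ Vt.T                         # data expressed in the principal-component basis
print(explained_var)
```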
Every transformation in computer graphics is a matrix operation. Understanding linear algebra is essential for 3D rendering, animation, and image processing.
Common transformations include:
- rotations (orthogonal matrices with determinant +1),
- scaling (diagonal matrices),
- translations (affine; handled with homogeneous coordinates),
- projections (orthographic or perspective).
Vertex transformation: v' = P V M v, where:
- M is the model matrix (object space to world space),
- V is the view matrix (world space to camera space),
- P is the projection matrix (camera space to clip space),
- v is the vertex position in homogeneous coordinates.
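A small NumPy sketch of homogeneous-coordinate transforms (the particular rotation and translation are illustrative, not from the text):

```python
import numpy as np

# Transform a vertex using 4x4 homogeneous matrices: rotate about z, then translate.
def translation(tx, ty, tz):
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[:2, :2] = [[c, -s], [s, c]]
    return R

v = np.array([1.0, 0.0, 0.0, 1.0])            # vertex (1, 0, 0) in homogeneous form
M = translation(2.0, 0.0, 0.0) @ rotation_z(np.pi / 2)
print(M @ v)                                   # approximately [2. 1. 0. 1.]
```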
Modern GPUs are optimized for matrix operations. Graphics shaders perform millions of matrix multiplications per second to render scenes in real-time.
Graphs and networks can be represented by matrices. Eigenvalues and eigenvectors reveal important structural properties.
For a graph with n vertices, the adjacency matrix A has:
- A_ij = 1 if there is an edge between vertices i and j,
- A_ij = 0 otherwise.
PageRank models the web as a directed graph. The transition matrix P has:
- P_ij = 1/(out-degree of j) if page j links to page i,
- P_ij = 0 otherwise.
PageRank is the dominant eigenvector (eigenvalue 1) of , representing the stationary distribution of a random walk.
For an irreducible stochastic matrix, the eigenvalue 1 is simple and the corresponding eigenvector has all positive entries.
A Markov chain is a random process where the next state depends only on the current state. Transition probabilities form a stochastic matrix. The long-run distribution is an eigenvector with eigenvalue 1.
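A minimal power-method sketch on a toy four-page graph (the damping step that forms the "Google matrix" is the standard modification ensuring irreducibility; it is an addition, not something described above):

```python
import numpy as np

# PageRank by the power method on a small 4-page web graph.
# links[j] lists the pages that page j links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n = 4

# Column-stochastic transition matrix: P[i, j] = 1 / out-degree(j) if j links to i.
P = np.zeros((n, n))
for j, outs in links.items():
    for i in outs:
        P[i, j] = 1.0 / len(outs)

# Damped "Google matrix"; 0.85 is the commonly quoted damping factor.
d = 0.85
G = d * P + (1 - d) / n * np.ones((n, n))

r = np.ones(n) / n
for _ in range(100):               # power iteration converges to the dominant eigenvector
    r = G @ r
print(r / r.sum())                 # stationary distribution: the PageRank scores
```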
Quantum mechanics is fundamentally linear algebraic. Quantum states are vectors, operations are unitary matrices, and measurement involves projections.
A qubit is a quantum state |ψ⟩ = α|0⟩ + β|1⟩ where |α|² + |β|² = 1.
This is a unit vector in ℂ².
A quantum gate is a unitary matrix U, satisfying U†U = I.
Unitary matrices preserve the norm, ensuring probability conservation.
Important quantum gates include the Pauli matrices:
- X = [0 1; 1 0] (bit flip),
- Y = [0 −i; i 0],
- Z = [1 0; 0 −1] (phase flip).
Quantum measurement projects the state onto an eigenspace of the measured observable. When measuring in an orthonormal basis {|i⟩}, the probability of outcome i is |⟨i|ψ⟩|².
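A minimal single-qubit simulation (assuming NumPy; the Hadamard gate is included for illustration even though only the Pauli matrices are listed above):

```python
import numpy as np

# Simulate one qubit: apply a gate, then read off measurement probabilities.
ket0 = np.array([1, 0], dtype=complex)                        # |0⟩

X = np.array([[0, 1], [1, 0]], dtype=complex)                 # Pauli X (bit flip)
Z = np.array([[1, 0], [0, -1]], dtype=complex)                # Pauli Z (phase flip)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard gate

assert np.allclose(H.conj().T @ H, np.eye(2))                 # unitarity: H†H = I

psi = H @ ket0                                                # |ψ⟩ = (|0⟩ + |1⟩) / √2
probs = np.abs(psi) ** 2                                      # |⟨i|ψ⟩|² in the computational basis
print(probs)                                                  # [0.5 0.5]
```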
Linear algebra appears in countless other fields. Here are a few more important examples.
Convolution is a linear operation, represented by a Toeplitz or circulant matrix. The Fourier transform is a change of basis to the frequency domain. Filtering is projection onto frequency subspaces.
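A small sketch of this equivalence (assuming NumPy): circular convolution by a kernel equals multiplication by the circulant matrix built from that kernel, and also equals pointwise multiplication in the Fourier basis.

```python
import numpy as np

# Circular convolution as multiplication by a circulant matrix,
# and the same result via the FFT (change of basis to the frequency domain).
x = np.array([1.0, 2.0, 3.0, 4.0])           # signal
h = np.array([1.0, 0.0, -1.0, 0.0])          # filter kernel

n = len(x)
C = np.column_stack([np.roll(h, k) for k in range(n)])   # circulant matrix built from h

y_matrix = C @ x
y_fft = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)))

assert np.allclose(y_matrix, y_fft)
print(y_matrix)                               # [-2. -2.  2.  2.]
```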
System stability is determined by eigenvalues of the state matrix. Stable systems have all eigenvalues with negative real parts (continuous) or magnitude less than 1 (discrete).
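A quick check of both criteria on toy state matrices (the numbers are illustrative only):

```python
import numpy as np

# Stability check via eigenvalues of the state matrix A.
A_cont = np.array([[-1.0,  2.0],
                   [ 0.0, -3.0]])              # continuous-time system dx/dt = A x

eigvals = np.linalg.eigvals(A_cont)
stable_continuous = np.all(eigvals.real < 0)   # all eigenvalues in the open left half-plane
print(eigvals, stable_continuous)              # [-1. -3.] True

A_disc = np.array([[0.5, 0.1],
                   [0.0, 0.8]])                # discrete-time system x[k+1] = A x[k]
stable_discrete = np.all(np.abs(np.linalg.eigvals(A_disc)) < 1)
print(stable_discrete)                         # True
```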
Lattice-based cryptography uses linear algebra over integers. Error-correcting codes use matrices over finite fields. Matrix groups provide structure for cryptographic protocols.
Quantum mechanics: operators are matrices, states are vectors. Mechanics: inertia tensors, stress tensors. Relativity: metric tensors, Lorentz transformations. Linear algebra is the language of physics.
A tensor is a multilinear map from products of vector spaces (and their duals) to scalars. Equivalently, it's an array of numbers that transforms in a specific way under change of basis. Scalars are (0,0)-tensors, vectors are (1,0)-tensors, and matrices are (1,1)-tensors.
ML is fundamentally about learning transformations from data. Neural networks are compositions of linear transformations (matrices) and nonlinear activations. Training uses gradient-based optimization, which relies on linear algebra. Data is stored in tensors (multi-dimensional arrays).
Model the web as a directed graph. The transition matrix has P_ij = 1/(out-degree of j) if j links to i. PageRank is the dominant eigenvector: the stationary distribution of a random walk on the web. It's computed using the power method.
Every transformation—rotation, scaling, translation, projection—is a matrix. Rendering pipelines multiply vertices by model, view, and projection matrices. Shaders perform matrix operations on GPU. 3D rotations are orthogonal matrices with determinant +1.
Quantum states are vectors in Hilbert space. Quantum gates are unitary matrices. Measurement involves projections. The entire theory is built on complex linear algebra. A qubit is a unit vector in ℂ², and quantum operations preserve the norm.