MathIsimple
LA-C12

Course 12: Advanced Topics Part 2

Multilinear Algebra & Applications

This final course introduces multilinear algebra and tensors, which generalize vectors and matrices to higher dimensions. We then explore the vast applications of linear algebra in machine learning, computer graphics, quantum computing, network analysis, and many other fields.

12-15 hours · Advanced Level · 10 Objectives
Learning Objectives
  • Define multilinear maps and tensor products.
  • Understand tensors as multilinear maps and their transformation properties.
  • Use Einstein summation notation for tensor operations.
  • Distinguish covariant and contravariant indices.
  • Apply linear algebra to machine learning and neural networks.
  • Understand linear algebra in computer graphics and transformations.
  • Model networks and graphs with adjacency matrices.
  • See connections to quantum computing and quantum mechanics.
  • Apply SVD and other techniques to image processing.
  • Understand Markov chains, PageRank, and other applications.
Prerequisites
  • LA-C11: Advanced Topics Part 1 (SVD & Quadratic Forms)
  • Dual spaces and linear functionals
  • Bilinear forms
  • All previous linear algebra concepts
  • Basic understanding of applications
Historical Context

Tensors were introduced by Woldemar Voigt (1850–1919) in the 1890s and developed by Gregorio Ricci-Curbastro (1853–1925) and Tullio Levi-Civita (1873–1941). Albert Einstein (1879–1955) used tensor calculus in general relativity. The modern applications of linear algebra have exploded with the digital age: machine learning relies heavily on matrix operations, Larry Page and Sergey Brin used eigenvectors for PageRank in 1998, and quantum computing is fundamentally built on linear algebra over complex numbers. Today, linear algebra is the mathematical foundation of data science, AI, graphics, and many other fields.

1. Multilinear Algebra and Tensors

Tensors generalize vectors and matrices to higher dimensions. They are fundamental in physics, differential geometry, and modern machine learning.

Definition 1.1: Multilinear Map

A function $B: V_1 \times V_2 \times \cdots \times V_k \to U$ is multilinear if it is linear in each argument when the others are held fixed.

A bilinear map is a multilinear map with $k = 2$.
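As a quick numerical illustration (not part of the original course materials), the map $B(u, v) = u^\top A v$ on $\mathbb{R}^3 \times \mathbb{R}^3$ is bilinear; the NumPy sketch below checks linearity in the first argument while the second is held fixed. The matrix $A$ and the scalars are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))          # fixed matrix defining B(u, v) = u^T A v
B = lambda u, v: u @ A @ v               # a bilinear map R^3 x R^3 -> R

u1, u2, v = rng.standard_normal((3, 3))  # three arbitrary vectors in R^3
a, b = 2.0, -1.5                         # arbitrary scalars

# Linearity in the first argument (the second argument v is held fixed):
lhs = B(a * u1 + b * u2, v)
rhs = a * B(u1, v) + b * B(u2, v)
assert np.isclose(lhs, rhs)
```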

Definition 1.2: Tensor Product

The tensor product $V \otimes W$ is a vector space such that bilinear maps from $V \times W$ correspond to linear maps from $V \otimes W$.

Elements of the form $v \otimes w$ are called simple tensors. General elements are sums of these.

Theorem 1.1: Dimension of Tensor Product

If $\dim V = m$ and $\dim W = n$, then $\dim(V \otimes W) = mn$.
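For coordinate spaces, a simple tensor $v \otimes w$ can be represented by the outer product of the coordinate vectors (equivalently, their Kronecker product, flattened), which makes the dimension count $mn$ concrete. A minimal NumPy sketch, assuming $V = \mathbb{R}^3$ and $W = \mathbb{R}^2$:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])   # vector in V = R^3  (m = 3)
w = np.array([4.0, 5.0])        # vector in W = R^2  (n = 2)

simple_tensor = np.outer(v, w)  # v ⊗ w represented as a 3x2 array
print(simple_tensor.shape)      # (3, 2): such arrays form a space of dimension 3 * 2 = 6

# The Kronecker product gives the same 6 coordinates as a flat vector in R^6
print(np.kron(v, w))
```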

Definition 1.3: Tensor

A (p,q)-tensor is a multilinear map from $(V^*)^p \times V^q$ to $F$.

Equivalently, it is an array of numbers with $p$ contravariant (upper) indices and $q$ covariant (lower) indices.

Example 1.1: Examples of Tensors
  • Scalar: (0,0)-tensor
  • Vector: (1,0)-tensor
  • Linear functional: (0,1)-tensor
  • Matrix/linear map: (1,1)-tensor
  • Bilinear form: (0,2)-tensor
Definition 1.4: Einstein Summation Convention

When an index appears twice, once as upper and once as lower, summation over that index is implied.

Example: $a_i b^i = \sum_i a_i b^i$.
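NumPy's einsum mirrors this convention: it does not distinguish upper from lower indices, but any repeated index letter is summed over. A small sketch:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
A = np.arange(9.0).reshape(3, 3)

# a_i b^i : the repeated index i is summed -> the dot product
print(np.einsum('i,i->', a, b))      # 32.0

# A^i_j b^j : a (1,1)-tensor (matrix) acting on a vector
print(np.einsum('ij,j->i', A, b))

# A^i_i : contracting the two indices of a (1,1)-tensor gives the trace
print(np.einsum('ii->', A))          # 12.0
```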

Remark 1.1: Transformation Law

Tensors are characterized by how their components transform under change of basis. This makes them coordinate-independent objects.

2. Applications in Machine Learning

Linear algebra is the mathematical foundation of machine learning. Neural networks, data representation, and optimization all rely heavily on matrix operations.

Definition 2.1: Neural Network Layer

A fully connected layer computes $y = \sigma(Wx + b)$ where:

  • $W$ is a weight matrix
  • $x$ is the input vector
  • $b$ is a bias vector
  • $\sigma$ is a nonlinear activation function
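
A minimal NumPy sketch of the forward pass through one such layer, using ReLU as an example activation (the shapes and values are arbitrary):

```python
import numpy as np

def dense_layer(x, W, b, activation=lambda z: np.maximum(z, 0.0)):
    """Fully connected layer: y = sigma(W x + b); ReLU is the default activation here."""
    return activation(W @ x + b)

rng = np.random.default_rng(0)
x = rng.standard_normal(4)        # input vector in R^4
W = rng.standard_normal((3, 4))   # weight matrix mapping R^4 -> R^3
b = rng.standard_normal(3)        # bias vector in R^3

y = dense_layer(x, W, b)
print(y.shape)                    # (3,)
```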
Example 2.1: Data Representation

Data in ML is stored in tensors:

  • Image: $H \times W \times C$ (height × width × channels)
  • Batch: $N \times H \times W \times C$ (batch size × image dimensions)
  • Sequence: $N \times T \times D$ (batch × time steps × features)
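
For example, with arbitrarily chosen sizes, the corresponding NumPy arrays would look like this:

```python
import numpy as np

image = np.zeros((224, 224, 3))      # H x W x C: one 224x224 RGB image
batch = np.zeros((32, 224, 224, 3))  # N x H x W x C: a batch of 32 such images
sequence = np.zeros((32, 100, 512))  # N x T x D: 32 sequences, 100 time steps, 512 features

print(image.ndim, batch.ndim, sequence.ndim)  # 3 4 3
```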
Remark 2.1: Gradient Descent

Training neural networks uses gradient descent, which requires computing gradients of matrix-valued functions. The chain rule involves matrix multiplication and transposition.
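As a concrete instance of that chain rule, for $L = \tfrac{1}{2}\lVert Wx \rVert^2$ the gradient with respect to the input is $\partial L/\partial x = W^\top(Wx)$, i.e. a transposed matrix applied to the upstream gradient. The sketch below (an illustration, not the course's code) verifies this against finite differences.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

loss = lambda x: 0.5 * np.sum((W @ x) ** 2)   # L = 1/2 ||Wx||^2, so dL/dy = y = Wx

grad_analytic = W.T @ (W @ x)                 # chain rule: dL/dx = W^T (dL/dy)

# Finite-difference check of the analytic gradient
eps = 1e-6
grad_numeric = np.array([(loss(x + eps * e) - loss(x - eps * e)) / (2 * eps)
                         for e in np.eye(4)])
assert np.allclose(grad_analytic, grad_numeric, atol=1e-4)
```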

Example 2.2: Principal Component Analysis (PCA)

PCA finds directions of maximum variance in data. It uses SVD of the centered data matrix. Principal components are right singular vectors, and explained variance comes from squared singular values.
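A compact sketch of PCA via the SVD of a centered data matrix (rows are samples; the data here is synthetic and the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))  # 200 samples, 5 features

Xc = X - X.mean(axis=0)                          # center each feature
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                                  # principal directions = right singular vectors
explained_var = s**2 / (len(Xc) - 1)             # variance explained by each component
scores = Xc @ Vt.T                               # data expressed in the principal-component basis

print(explained_var / explained_var.sum())       # fraction of total variance per component
```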

3. Computer Graphics

Every transformation in computer graphics is a matrix operation. Understanding linear algebra is essential for 3D rendering, animation, and image processing.

Definition 3.1: 3D Transformations

Common transformations:

  • Rotation: Orthogonal matrix with $\det = 1$ (SO(3))
  • Scaling: Diagonal matrix
  • Translation: Affine transformation (homogeneous coordinates)
  • Projection: Perspective or orthographic projection matrix
Example 3.1: Rendering Pipeline

Vertex transformation: $v_{\text{screen}} = P \cdot V \cdot M \cdot v_{\text{model}}$, where:

  • $M$ = model matrix (object to world)
  • $V$ = view matrix (world to camera)
  • $P$ = projection matrix (camera to screen)
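
A toy version of this transform in homogeneous coordinates, with a z-axis rotation as the model matrix, a translation as the view matrix, and a crude orthographic projection; all three matrices are illustrative stand-ins rather than output of any graphics API:

```python
import numpy as np

def rotation_z(theta):
    """4x4 homogeneous rotation about the z-axis (SO(3) embedded in homogeneous form)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0],
                     [s,  c, 0, 0],
                     [0,  0, 1, 0],
                     [0,  0, 0, 1]])

def translation(tx, ty, tz):
    """4x4 homogeneous translation."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

M = rotation_z(np.pi / 4)           # model matrix: object -> world
V = translation(0.0, 0.0, -5.0)     # view matrix: world -> camera (camera 5 units back)
P = np.diag([1.0, 1.0, 0.0, 1.0])   # toy orthographic projection: drop the z coordinate

v_model = np.array([1.0, 0.0, 0.0, 1.0])   # a vertex in homogeneous coordinates
v_screen = P @ V @ M @ v_model
print(v_screen)
```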
Remark 3.1: GPU Computing

Modern GPUs are optimized for matrix operations. Graphics shaders perform millions of matrix multiplications per second to render scenes in real-time.

4. Network Analysis and PageRank

Graphs and networks can be represented by matrices. Eigenvalues and eigenvectors reveal important structural properties.

Definition 4.1: Adjacency Matrix

For a graph with $n$ vertices, the adjacency matrix $A$ has:

$$A_{ij} = \begin{cases} 1 & \text{if there is an edge from } i \text{ to } j \\ 0 & \text{otherwise} \end{cases}$$
Definition 4.2: PageRank

PageRank models the web as a directed graph. The transition matrix $P$ has:

$$P_{ij} = \begin{cases} 1/\text{out-degree}(j) & \text{if } j \text{ links to } i \\ 0 & \text{otherwise} \end{cases}$$

PageRank is the dominant eigenvector (eigenvalue 1) of $P$, representing the stationary distribution of a random walk.
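A minimal power-method sketch on a tiny hand-made link graph. The damping factor 0.85 is the commonly quoted default; the graph itself and the iteration count are arbitrary choices for illustration.

```python
import numpy as np

# A[i, j] = 1 if page j links to page i (four pages, made-up links)
A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 0],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)

out_degree = A.sum(axis=0)        # column sums = out-degrees
P = A / out_degree                # column-stochastic transition matrix
n = len(A)
d = 0.85                          # damping factor
G = d * P + (1 - d) / n           # random surfer with teleportation

r = np.full(n, 1 / n)             # start from the uniform distribution
for _ in range(100):              # power iteration converges to the dominant eigenvector
    r = G @ r
print(r)                          # PageRank scores (eigenvalue-1 eigenvector of G)
```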

Theorem 4.1: Perron-Frobenius

For an irreducible stochastic matrix, the eigenvalue 1 is simple and the corresponding eigenvector has all positive entries.

Example 4.1: Markov Chains

A Markov chain is a random process where the next state depends only on the current state. Transition probabilities form a stochastic matrix. The long-run distribution is an eigenvector with eigenvalue 1.

5. Quantum Computing

Quantum mechanics is fundamentally linear algebraic. Quantum states are vectors, operations are unitary matrices, and measurement involves projections.

Definition 5.1: Qubit

A qubit is a quantum state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ where $|\alpha|^2 + |\beta|^2 = 1$.

This is a unit vector in $\mathbb{C}^2$.

Definition 5.2: Quantum Gate

A quantum gate is a unitary matrix $U$ (satisfying $U^* U = I$).

Unitary matrices preserve the norm, ensuring probability conservation.

Example 5.1: Pauli Matrices

Important quantum gates include the Pauli matrices:

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
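A short NumPy check that the Pauli matrices are unitary, together with one gate application and the resulting measurement probabilities (purely illustrative):

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]])
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# Each Pauli matrix is unitary: U* U = I (conj().T is the conjugate transpose)
for U in (sigma_x, sigma_y, sigma_z):
    assert np.allclose(U.conj().T @ U, np.eye(2))

ket0 = np.array([1, 0], dtype=complex)   # |0> as a unit vector in C^2
psi = sigma_x @ ket0                     # the X gate flips |0> to |1>

# Measurement in the computational basis: P(outcome i) = |<e_i|psi>|^2
probs = np.abs(psi) ** 2
print(probs)                             # [0. 1.]
```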
Remark 5.1: Measurement

Quantum measurement projects the state onto an eigenspace. The probability of outcome $i$ is $|\langle \psi | e_i \rangle|^2$.

6. Other Applications

Linear algebra appears in countless other fields. Here are a few more important examples.

Example 6.1: Signal Processing

Convolution is a linear operation, represented by a Toeplitz or circulant matrix. The Fourier transform is a change of basis to the frequency domain. Filtering is projection onto frequency subspaces.
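A sketch of the circulant-matrix view of circular convolution, checked against the frequency-domain computation (the DFT diagonalizes circulant matrices); the signal and filter are made up:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # signal
h = np.array([1.0, 0.5, 0.0, 0.0])   # filter
N = len(x)

# Circulant matrix C with C[i, j] = h[(i - j) mod N]; C @ x is circular convolution
C = np.array([[h[(i - j) % N] for j in range(N)] for i in range(N)])
y_matrix = C @ x

# The same convolution computed in the frequency domain
y_fft = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)))

assert np.allclose(y_matrix, y_fft)
print(y_matrix)
```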

Example 6.2: Control Theory

System stability is determined by eigenvalues of the state matrix. Stable systems have all eigenvalues with negative real parts (continuous) or magnitude less than 1 (discrete).
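A quick eigenvalue check for a continuous-time system $\dot{x} = Ax$, with an arbitrary example state matrix:

```python
import numpy as np

A = np.array([[-1.0,  2.0],
              [ 0.0, -3.0]])             # state matrix of x' = A x

eigvals = np.linalg.eigvals(A)
stable = bool(np.all(eigvals.real < 0))  # continuous-time: stable iff every eigenvalue has negative real part
print(eigvals, stable)                   # [-1. -3.] True
```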

Example 6.3: Cryptography

Lattice-based cryptography uses linear algebra over integers. Error-correcting codes use matrices over finite fields. Matrix groups provide structure for cryptographic protocols.

Example 6.4: Physics

Quantum mechanics: operators are matrices, states are vectors. Mechanics: inertia tensors, stress tensors. Relativity: metric tensors, Lorentz transformations. Linear algebra is the language of physics.

Course 12 Practice Quiz
10 Questions

  1. (Easy) A bilinear map $B: V \times W \to U$ is linear in:
  2. (Medium) If $\dim V = m$ and $\dim W = n$, then $\dim(V \otimes W) =$
  3. (Easy) The outer product of column vectors $u$ and $v$ is:
  4. (Easy) In machine learning, the weights of a neural network layer are represented as:
  5. (Medium) The PageRank algorithm finds:
  6. (Medium) In computer graphics, a 3D rotation is represented by:
  7. (Medium) In quantum computing, a qubit state is:
  8. (Easy) The adjacency matrix of a graph has $A_{ij} = 1$ iff:
  9. (Easy) PCA finds directions of:
  10. (Medium) Einstein summation convention means:

Frequently Asked Questions

What is a tensor?

A tensor is a multilinear map from products of vector spaces (and their duals) to scalars. Equivalently, it's an array of numbers that transforms in a specific way under change of basis. Scalars are (0,0)-tensors, vectors are (1,0)-tensors, and matrices are (1,1)-tensors.

Why is linear algebra so important in machine learning?

ML is fundamentally about learning transformations from data. Neural networks are compositions of linear transformations (matrices) and nonlinear activations. Training uses gradient-based optimization, which relies on linear algebra. Data is stored in tensors (multi-dimensional arrays).

How does Google's PageRank work?

Model the web as a directed graph. The transition matrix has $P_{ij} = 1/\text{out-degree}(j)$ if $j$ links to $i$. PageRank is the dominant eigenvector: the stationary distribution of a random walk on the web. It is computed using the power method.

How is linear algebra used in computer graphics?

Every transformation (rotation, scaling, translation, projection) is a matrix. Rendering pipelines multiply vertices by model, view, and projection matrices. Shaders perform these matrix operations on the GPU. 3D rotations are orthogonal matrices with determinant +1.

What makes quantum computing linear algebraic?

Quantum states are vectors in Hilbert space. Quantum gates are unitary matrices. Measurement involves projections. The entire theory is built on complex linear algebra. A qubit is a unit vector in ℂ², and quantum operations preserve the norm.