
Matrix Completion

Learn how to recover missing elements in matrices using low-rank assumptions. Master the conversion from rank minimization to nuclear norm minimization, understand semidefinite programming solutions, and discover recovery conditions for perfect matrix reconstruction.

Module 8 of 8
Intermediate to Advanced
120-150 min

What is Matrix Completion?

Matrix completion is the problem of recovering a matrix from a subset of its entries. Given partial observations, we want to infer the missing values under the assumption that the matrix is low-rank.

Problem Formulation

Given a matrix $A \in \mathbb{R}^{m \times n}$ with observed entries at positions $\Omega$, find the complete matrix $X$ such that:

$$X_{ij} = A_{ij} \quad \forall (i,j) \in \Omega$$

and $X$ has low rank (or is as low-rank as possible).
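
To make the setup concrete, here is a minimal NumPy sketch of the problem data; the sizes, sampling rate, and the rank-$r$ factor construction are illustrative assumptions, not part of any particular application:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 100, 80, 5

# Ground-truth low-rank matrix: A = B C^T has rank at most r
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Observe 30% of the entries uniformly at random; `mask` plays the role of Omega
mask = rng.random((m, n)) < 0.3
A_obs = np.where(mask, A, 0.0)   # unobserved entries stored as 0
```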

Applications

  • Recommendation systems
  • Collaborative filtering
  • Signal recovery
  • Image inpainting
  • Sensor data recovery

Challenges

  • Rank minimization is NP-hard
  • Requires sufficient observations
  • Sensitive to noise
  • Computational complexity
  • Recovery conditions

Rank Minimization Problem

The ideal approach is to minimize the rank of the completed matrix:

Original Problem

$$\min \ \text{rank}(X) \quad \text{s.t. } X_{ij} = A_{ij}, \quad (i,j) \in \Omega$$

where $\text{rank}(X)$ is the rank of the matrix $X$.

Why Rank Minimization is Hard

  • NP-Hard: Rank minimization is computationally intractable in general
  • Non-convex: Rank function is non-convex and discontinuous
  • Combinatorial: Requires searching over all possible ranks
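
A quick numerical illustration of the discontinuity point, using only NumPy's built-in rank and norm routines: shrinking a singular value toward zero leaves the rank unchanged until the value vanishes, at which point the rank jumps, while the nuclear norm (introduced next) varies smoothly.

```python
import numpy as np

# Rank is discontinuous in the matrix entries
print(np.linalg.matrix_rank(np.diag([1.0, 1e-3])))  # 2
print(np.linalg.matrix_rank(np.diag([1.0, 0.0])))   # 1

# The nuclear norm changes continuously (1.001 -> 1.0 here),
# which is what makes it amenable to convex optimization
print(np.linalg.norm(np.diag([1.0, 1e-3]), "nuc"))  # 1.001
print(np.linalg.norm(np.diag([1.0, 0.0]), "nuc"))   # 1.0
```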

Nuclear Norm: Convex Relaxation of Rank

The nuclear norm is the tightest convex relaxation of the rank function, analogous to how the L1 norm relaxes the L0 norm for vectors.

Nuclear Norm Definition

The nuclear norm (also called the trace norm) of a matrix $X$ is the sum of its singular values:

$$\|X\|_* = \sum_{j=1}^{\min(m,n)} \sigma_j(X)$$

where $\sigma_j(X)$ are the singular values of $X$.
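
Computing the nuclear norm is straightforward given an SVD. A small sketch (the random test matrix is illustrative) checking the definition against NumPy's built-in matrix norm:

```python
import numpy as np

X = np.random.default_rng(1).standard_normal((6, 4))

sigma = np.linalg.svd(X, compute_uv=False)  # singular values sigma_j(X)
nuclear = sigma.sum()                       # ||X||_* = sum of singular values

assert np.isclose(nuclear, np.linalg.norm(X, "nuc"))
print(nuclear)
```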

Key Properties

  • Convex: Nuclear norm is convex (unlike rank)
  • Tightest relaxation: The nuclear norm is the convex envelope of the rank function on the spectral-norm unit ball
  • Low-rank promoting: Minimizing nuclear norm encourages low-rank solutions
  • Efficient optimization: Can be solved via semidefinite programming (SDP)

Relaxed Problem

Replace rank with nuclear norm:

$$\min \|X\|_* \quad \text{s.t. } X_{ij} = A_{ij}, \quad (i,j) \in \Omega$$

This is a convex optimization problem that can be solved efficiently!
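
As a sketch, the relaxed problem translates almost line-for-line into cvxpy, whose `normNuc` atom is the nuclear norm; the problem sizes, the random seed, and the choice of the SCS solver below are illustrative assumptions:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 30, 20, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
M = (rng.random((m, n)) < 0.5).astype(float)   # observation mask Omega

X = cp.Variable((m, n))
prob = cp.Problem(cp.Minimize(cp.normNuc(X)),      # min ||X||_*
                  [cp.multiply(M, X) == M * A])    # X_ij = A_ij on Omega
prob.solve(solver=cp.SCS)

print("relative recovery error:",
      np.linalg.norm(X.value - A) / np.linalg.norm(A))
```

With enough observed entries, the relative error should be near zero, matching the recovery conditions discussed later in this module.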

Semidefinite Programming (SDP) Solution

Nuclear norm minimization can be reformulated as a semidefinite program:

SDP Reformulation

Nuclear norm minimization is equivalent to:

$$\min \frac{1}{2}\left(\text{tr}(W_1) + \text{tr}(W_2)\right)$$

subject to:

$$\begin{bmatrix} W_1 & X \\ X^T & W_2 \end{bmatrix} \succeq 0, \quad X_{ij} = A_{ij}, \quad (i,j) \in \Omega$$

where $\succeq 0$ means positive semidefinite, and $W_1, W_2$ are auxiliary symmetric matrix variables.
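
The block-matrix reformulation can also be written out explicitly, again using cvxpy as an assumed toolchain; solving it recovers the same optimum as the `normNuc` formulation above:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 15, 10, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
M = (rng.random((m, n)) < 0.6).astype(float)        # observation mask Omega

X = cp.Variable((m, n))
W1 = cp.Variable((m, m), symmetric=True)            # auxiliary variables
W2 = cp.Variable((n, n), symmetric=True)

constraints = [cp.bmat([[W1, X], [X.T, W2]]) >> 0,  # PSD block constraint
               cp.multiply(M, X) == M * A]          # data constraints
prob = cp.Problem(cp.Minimize(0.5 * (cp.trace(W1) + cp.trace(W2))),
                  constraints)
prob.solve(solver=cp.SCS)

# The optimal value equals the nuclear norm of the recovered matrix
print(prob.value, np.linalg.norm(X.value, "nuc"))
```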

Why SDP?

  • Convex: SDP is a convex optimization problem with polynomial-time interior-point methods
  • Efficient solvers: Many mature SDP solvers available (CVX, SDPT3, MOSEK)
  • Global optimum: Convexity guarantees that any solution found is the global minimum
  • Duality: Strong duality holds, enabling dual-based algorithms

Recovery Conditions

Under what conditions can we perfectly recover a low-rank matrix from partial observations?

Main Result

If the matrix $A$ has rank $r$, satisfies standard incoherence conditions, and the observed entries are sampled uniformly at random, then with high probability nuclear norm minimization perfectly recovers $A$ if:

$$|\Omega| \geq C \cdot r(m+n) \log^2(m+n)$$

where $|\Omega|$ is the number of observed entries, and $C$ is a constant (typically $C \approx 5$–$10$).
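
A small calculator for this bound makes the savings tangible; the specific value of $C$ is an assumption within the $5$–$10$ range quoted above:

```python
import numpy as np

def required_samples(m, n, r, C=5.0):
    """Observations needed per |Omega| >= C * r * (m+n) * log^2(m+n)."""
    return C * r * (m + n) * np.log(m + n) ** 2

m, n, r = 100_000, 100_000, 5
need = required_samples(m, n, r)
print(f"need ~{need:.3g} of {m * n:.3g} entries "
      f"({100 * need / (m * n):.2f}% of the full matrix)")
```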

Interpretation

  • Near-optimal: We need roughly $O(r(m+n))$ observations, which is much less than the $mn$ entries of the full matrix
  • Rank-dependent: Lower rank requires fewer observations
  • Logarithmic factor: The $\log^2$ factor is small compared to the linear term
  • Random sampling: Uniform random sampling is crucial (structured sampling may require more observations)

Practical Example: Recommendation System

Matrix completion is the foundation of collaborative filtering in recommendation systems.

Problem Setup

User-item rating matrix: 10,000 users × 5,000 movies. Only 5% of ratings are observed (sparse). Matrix is low-rank (users have similar preferences, movies have similar features).

Matrix Completion

Apply nuclear norm minimization:

  • Estimated rank: $r \approx 20$
  • Required observations: $\approx 300{,}000$ (we have 2.5M, sufficient)
  • Recovery accuracy: $> 95\%$ on the test set
  • Prediction quality: RMSE $< 0.8$ (good for 1–5 star ratings)
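
At this scale (10,000 × 5,000), interior-point SDP solvers are impractical, so implementations typically turn to first-order methods. Below is a minimal sketch of singular value thresholding (SVT, Cai–Candès–Shen 2010), one standard first-order algorithm for nuclear norm problems; the threshold, step size, and iteration count are illustrative assumptions, and a real system would also replace the full SVD with a truncated or randomized one:

```python
import numpy as np

def svt_complete(A_obs, mask, tau=5.0, step=1.2, iters=300):
    """Singular value thresholding for matrix completion.

    A_obs: matrix with observed entries filled in (zeros elsewhere)
    mask:  boolean/0-1 array marking the observed positions Omega
    """
    Y = np.zeros_like(A_obs)
    X = Y
    for _ in range(iters):
        # Shrinkage: soft-threshold the singular values of Y by tau
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt
        # Gradient step: push X back toward the data on observed entries
        Y = Y + step * mask * (A_obs - X)
    return X
```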

Impact

Enables accurate movie recommendations for users who haven't rated many movies. Powers Netflix, Amazon, and other recommendation systems.

Key Takeaways

Matrix completion recovers missing matrix entries under the low-rank assumption, enabling applications in recommendation systems and data recovery.

Rank minimization is NP-hard, but nuclear norm (sum of singular values) provides the tightest convex relaxation.

Nuclear norm minimization can be solved via semidefinite programming (SDP), providing efficient polynomial-time algorithms.

Perfect recovery is possible with $O(r(m+n) \log^2(m+n))$ observations, much less than the full $mn$ entries.

Matrix completion is fundamental to collaborative filtering, enabling accurate recommendations from sparse user-item interactions.

The nuclear norm approach extends the L1 sparsity concept from vectors to matrices, completing the sparse learning framework.