
Multivariate Statistics Practice

Problems on PCA, factor analysis, discriminant analysis, MANOVA, clustering, and canonical correlation

8 Problems
Suggested: 2 hours

Instructions

  • Try to solve each problem before viewing the solution
  • Focus on understanding the problem-solving methodology
1. Multivariate Normal Distribution
Problem

Let $\mathbf{X} = (X_1, X_2, X_3)^T \sim N(\boldsymbol{\mu}, \Sigma)$, where:

$$\boldsymbol{\mu} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 4 & 2 & 0 \\ 2 & 9 & 1 \\ 0 & 1 & 1 \end{pmatrix}$$

(1) Find the distribution of X₁ + X₂.

(2) Find the conditional distribution of X₃ given X₁ = 2, X₂ = 3.

(3) Are X₁ and X₃ independent?
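
After attempting the problem, the three parts can be checked numerically. The sketch below is one possible check, not the site's solution: it uses the standard results that a linear combination $\mathbf{a}^T\mathbf{X}$ is $N(\mathbf{a}^T\boldsymbol{\mu}, \mathbf{a}^T\Sigma\mathbf{a})$ and that the conditional mean and variance are $\mu_3 + \Sigma_{3,12}\Sigma_{12,12}^{-1}(\mathbf{x}_{12} - \boldsymbol{\mu}_{12})$ and $\Sigma_{33} - \Sigma_{3,12}\Sigma_{12,12}^{-1}\Sigma_{12,3}$. All variable names are illustrative assumptions; exact fractions are used to avoid rounding.

```python
from fractions import Fraction as Fr

mu = [Fr(1), Fr(2), Fr(3)]
Sigma = [[Fr(4), Fr(2), Fr(0)],
         [Fr(2), Fr(9), Fr(1)],
         [Fr(0), Fr(1), Fr(1)]]

# (1) X1 + X2 = a^T X with a = (1, 1, 0): mean a^T mu, variance a^T Sigma a.
a = [1, 1, 0]
sum_mean = sum(ai * mi for ai, mi in zip(a, mu))
sum_var = sum(a[i] * Sigma[i][j] * a[j] for i in range(3) for j in range(3))

# (2) Condition X3 on (X1, X2) = (2, 3).
S11 = [[Sigma[0][0], Sigma[0][1]],
       [Sigma[1][0], Sigma[1][1]]]          # covariance of (X1, X2)
s21 = [Sigma[2][0], Sigma[2][1]]            # covariance of X3 with (X1, X2)
det = S11[0][0] * S11[1][1] - S11[0][1] * S11[1][0]
S11_inv = [[S11[1][1] / det, -S11[0][1] / det],
           [-S11[1][0] / det, S11[0][0] / det]]
# Regression weights w = s21 * S11^{-1}
w = [sum(s21[k] * S11_inv[k][j] for k in range(2)) for j in range(2)]
x_given = [Fr(2), Fr(3)]
cond_mean = mu[2] + sum(w[j] * (x_given[j] - mu[j]) for j in range(2))
cond_var = Sigma[2][2] - sum(w[j] * s21[j] for j in range(2))

# (3) For jointly normal variables, zero covariance implies independence.
x1_x3_independent = (Sigma[0][2] == 0)

print("X1 + X2 ~ N(%s, %s)" % (sum_mean, sum_var))
print("X3 | X1=2, X2=3 ~ N(%s, %s)" % (cond_mean, cond_var))
print("X1 and X3 independent:", x1_x3_independent)
```

Note that independence in part (3) follows from the zero covariance only because the vector is jointly normal; for other distributions a zero covariance would not be enough.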

2. Principal Component Analysis Application
Problem

PCA is performed on a 5-variable dataset. The eigenvalues of the correlation matrix are: 2.8, 1.5, 0.4, 0.2, 0.1.

(1) How much variance is explained by the first PC?

(2) How many PCs would you retain using Kaiser's criterion?

(3) What percentage of variance is retained with 2 PCs?
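
The arithmetic here can be verified with a few lines. This is a minimal sketch that assumes the usual conventions: the proportion of variance for a component is its eigenvalue divided by the sum of all eigenvalues, and Kaiser's criterion keeps components with eigenvalue greater than 1 (appropriate for a correlation matrix). The variable names are illustrative.

```python
eigenvalues = [2.8, 1.5, 0.4, 0.2, 0.1]

# For a correlation matrix the eigenvalues sum to the number of variables.
total = sum(eigenvalues)
proportions = [ev / total for ev in eigenvalues]

first_pc_share = proportions[0]                        # part (1)
n_kaiser = sum(1 for ev in eigenvalues if ev > 1.0)    # part (2)
two_pc_share = sum(proportions[:2])                    # part (3)

print("Share of variance, PC1: %.1f%%" % (100 * first_pc_share))
print("Components kept by Kaiser's criterion:", n_kaiser)
print("Cumulative share, PC1 + PC2: %.1f%%" % (100 * two_pc_share))
```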

3. Hotelling's T² Test
Problem

Test H₀: $\boldsymbol{\mu} = \boldsymbol{\mu}_0$ for a 3-dimensional multivariate normal with n = 25 observations.

Given: $\bar{\mathbf{x}} = (5, 7, 9)^T$, $\boldsymbol{\mu}_0 = (4, 6, 8)^T$, and sample covariance S.

(1) Write the T² statistic formula.

(2) What is the null distribution of T²?

(3) How do you convert T² to an F-statistic?
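
As a check on the formulas, the sketch below computes $T^2 = n(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)^T S^{-1}(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)$ and the standard conversion $F = \frac{n-p}{p(n-1)}T^2 \sim F(p, n-p)$ under H₀. The problem does not give a numeric S, so `S_hypothetical` (the identity matrix) is a placeholder assumption used only to make the code run; the helper names `solve`, `hotelling_t2`, and `t2_to_f` are likewise illustrative.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def hotelling_t2(n, xbar, mu0, S):
    """T^2 = n * d^T S^{-1} d with d = xbar - mu0."""
    d = [a - b for a, b in zip(xbar, mu0)]
    y = solve(S, d)                        # y = S^{-1} d
    return n * sum(di * yi for di, yi in zip(d, y))

def t2_to_f(t2, n, p):
    """Return (F statistic, numerator df, denominator df)."""
    return (n - p) / (p * (n - 1)) * t2, p, n - p

n, p = 25, 3
xbar = [5.0, 7.0, 9.0]
mu0 = [4.0, 6.0, 8.0]
# Placeholder only: the problem leaves S unspecified.
S_hypothetical = [[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]]

t2 = hotelling_t2(n, xbar, mu0, S_hypothetical)
f_stat, df1, df2 = t2_to_f(t2, n, p)
print("T^2 = %.3f, F = %.3f on (%d, %d) df" % (t2, f_stat, df1, df2))
```

With n = 25 and p = 3 the conversion factor is 22/72, so the F statistic is compared with an F(3, 22) distribution whatever the actual S turns out to be.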

4. Discriminant Analysis
Problem

Two populations with equal covariance matrices:

$$\boldsymbol{\mu}_1 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \boldsymbol{\mu}_2 = \begin{pmatrix} 4 \\ 1 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$

(1) Find Fisher's linear discriminant function.

(2) Classify a new observation $\mathbf{x}_0 = (3, 2)^T$ assuming equal priors.
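
One way to check both parts is sketched below, assuming the usual form of Fisher's rule: the discriminant direction is $\mathbf{a} = \Sigma^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)$, and with equal priors an observation is assigned to population 1 when $\mathbf{a}^T\mathbf{x}_0$ exceeds the cutoff $\mathbf{a}^T(\boldsymbol{\mu}_1 + \boldsymbol{\mu}_2)/2$. Helper names are illustrative, and exact fractions are used so that a point lying exactly on the boundary is detected reliably.

```python
from fractions import Fraction as Fr

mu1 = [Fr(2), Fr(3)]
mu2 = [Fr(4), Fr(1)]
Sigma = [[Fr(2), Fr(1)],
         [Fr(1), Fr(2)]]
x0 = [Fr(3), Fr(2)]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# (1) Discriminant direction a = Sigma^{-1} (mu1 - mu2).
diff = [p - q for p, q in zip(mu1, mu2)]
a = matvec(inv2(Sigma), diff)

# (2) Compare the score a^T x0 with the midpoint cutoff.
midpoint = [(p + q) / 2 for p, q in zip(mu1, mu2)]
cutoff = dot(a, midpoint)
score = dot(a, x0) - cutoff

if score > 0:
    label = "population 1"
elif score < 0:
    label = "population 2"
else:
    label = "boundary (tie)"

print("a =", a, "cutoff =", cutoff, "score =", score, "->", label)
```

With the numbers as stated, $\mathbf{x}_0$ coincides with the midpoint of the two means, so the score is exactly zero and the rule does not favor either population; a tie-breaking convention would be needed to assign it.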

5. Factor Analysis Rotation
Problem

After extracting 2 factors, the unrotated loadings are:

$$L = \begin{pmatrix} 0.8 & 0.4 \\ 0.7 & 0.5 \\ 0.6 & -0.3 \\ 0.5 & -0.6 \end{pmatrix}$$

(1) Why is rotation often applied?

(2) What is the goal of varimax rotation?

(3) Interpret the factor pattern (which variables load on which factors).
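
To see what an orthogonal rotation does to these loadings, the following sketch searches over rotation angles for the one that maximizes the raw varimax criterion (the sum, over factors, of the variance of the squared loadings). This is an illustration under stated assumptions, not a production algorithm: it uses a simple grid search that only works for two factors, it omits Kaiser normalization, and the function names are made up for this example.

```python
import math

L = [[0.8, 0.4],
     [0.7, 0.5],
     [0.6, -0.3],
     [0.5, -0.6]]

def rotate(loadings, theta):
    """Apply an orthogonal rotation by angle theta to a two-factor loading matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

def varimax_criterion(loadings):
    """Sum over the two factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(2):
        sq = [row[j] ** 2 for row in loadings]
        mean = sum(sq) / p
        total += sum((v - mean) ** 2 for v in sq) / p
    return total

def communalities(loadings):
    """Row sums of squared loadings; unchanged by orthogonal rotation."""
    return [a * a + b * b for a, b in loadings]

# The criterion repeats every pi/2, so a grid over [0, pi/2) suffices.
angles = [k * (math.pi / 2) / 9000 for k in range(9000)]
best_theta = max(angles, key=lambda t: varimax_criterion(rotate(L, t)))
L_rot = rotate(L, best_theta)

print("Rotation angle (degrees): %.2f" % math.degrees(best_theta))
for before, after in zip(L, L_rot):
    print(before, "->", [round(v, 3) for v in after])
print("Criterion before: %.4f, after: %.4f"
      % (varimax_criterion(L), varimax_criterion(L_rot)))
```

Comparing the rows before and after rotation helps with part (3): in the unrotated solution every variable loads positively on the first factor while the second factor contrasts variables 1 and 2 with variables 3 and 4, and the rotated loadings can be inspected to see whether each pair lines up with its own factor. The communalities are the same in both solutions, which is why rotation changes interpretation but not fit.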

6. Cluster Analysis Validation
Problem

K-means clustering with k = 3 produces within-cluster sum of squares: WSS₁ = 50, WSS₂ = 45, WSS₃ = 40.

(1) Calculate total WSS.

(2) If total SS = 200, find the R² measure.

(3) Describe the silhouette coefficient and its interpretation.
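
The quantities in this problem can be checked with the short sketch below. It assumes the common definitions $R^2 = 1 - \text{WSS}_{\text{total}}/\text{SS}_{\text{total}}$ (the share of total variation that lies between clusters) and, for a single point, the silhouette $s = (b - a)/\max(a, b)$, where $a$ is its mean distance to points in its own cluster and $b$ its mean distance to points in the nearest other cluster. The function name and the example distances passed to it are illustrative assumptions, not data from the problem.

```python
wss = [50, 45, 40]
total_ss = 200

total_wss = sum(wss)                       # part (1)
r_squared = 1 - total_wss / total_ss       # part (2)

def silhouette(a, b):
    """Silhouette of one point: a = mean intra-cluster distance,
    b = mean distance to the nearest other cluster. Lies in [-1, 1]."""
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom

print("Total WSS:", total_wss)
print("R^2: %.3f" % r_squared)
# Hypothetical distances, for illustration only:
print("Well-separated point:", round(silhouette(1.0, 3.0), 3))   # close to own cluster
print("Likely misassigned point:", round(silhouette(3.0, 1.0), 3))
```

Values near 1 indicate a point that sits well inside its cluster, values near 0 a point on the border between clusters, and negative values a point that is closer to another cluster than to its own.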

7. Canonical Correlation
Problem

Two sets of variables: $\mathbf{X} = (X_1, X_2)$ and $\mathbf{Y} = (Y_1, Y_2)$.

First canonical correlation ρ₁ = 0.85.

(1) What does canonical correlation measure?

(2) How many canonical correlations exist?

(3) Interpret ρ₁ = 0.85.
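
Parts (2) and (3) involve only a little arithmetic, sketched here under the standard facts that the number of canonical correlations equals the smaller of the two set sizes and that the squared canonical correlation is the proportion of variance shared by the corresponding pair of canonical variates. Variable names are illustrative.

```python
p, q = 2, 2                 # number of variables in X and in Y
rho1 = 0.85                 # first canonical correlation, as given

n_canonical = min(p, q)     # part (2)
shared_variance = rho1 ** 2 # part (3): variance shared by the first pair of variates

print("Number of canonical correlations:", n_canonical)
print("Shared variance of first canonical pair: %.2f%%" % (100 * shared_variance))
```

Because canonical correlations are ordered from largest to smallest, the second one here cannot exceed 0.85.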

8. MANOVA vs Multiple ANOVAs
Problem

Compare 3 treatment groups on 4 response variables.

(1) Why use MANOVA instead of 4 separate ANOVAs?

(2) State the null hypothesis in MANOVA.

(3) Name three MANOVA test statistics and their properties.
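
Part (1) is partly about error-rate inflation, which a quick calculation makes concrete. The sketch assumes a per-test significance level of 0.05 (not stated in the problem) and, for the exact figure, independent tests; responses in a MANOVA setting are usually correlated, so the independent-tests number is only a rough guide, while the Bonferroni bound holds regardless.

```python
alpha = 0.05        # assumed per-test significance level
n_tests = 4         # one ANOVA per response variable

# Probability of at least one false rejection if the tests were independent.
fwer_independent = 1 - (1 - alpha) ** n_tests
# Bonferroni upper bound, valid under any dependence between tests.
fwer_bound = min(1.0, n_tests * alpha)
# Per-test level that would keep the family-wise rate at or below alpha.
alpha_bonferroni = alpha / n_tests

print("Family-wise error rate (independent tests): %.4f" % fwer_independent)
print("Bonferroni upper bound: %.2f" % fwer_bound)
print("Bonferroni-adjusted per-test level: %.4f" % alpha_bonferroni)
```

Running four separate tests at the 0.05 level therefore risks a family-wise error rate well above 0.05, and separate ANOVAs also ignore the correlations among the four responses, which a single multivariate test takes into account.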
