Problems on PCA, factor analysis, discriminant analysis, MANOVA, clustering, and canonical correlation
Instructions
Let where:
(1) Find the distribution of X₁ + X₂.
(2) Find the conditional distribution of X₃ given X₁ = 2, X₂ = 3.
(3) Are X₁ and X₃ independent?
PCA is performed on a 5-variable dataset. The eigenvalues of the correlation matrix are: 2.8, 1.5, 0.4, 0.2, 0.1.
(1) What proportion of the total variance is explained by the first PC?
(2) How many PCs would you retain using Kaiser's criterion?
(3) What percentage of the total variance is retained by the first 2 PCs?
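The arithmetic behind these three parts can be checked with a short sketch. It relies on the fact that the eigenvalues of a correlation matrix sum to the number of variables (here 5), and it takes Kaiser's criterion to mean "retain components with eigenvalue greater than 1". The variable names are illustrative only.

```python
# Eigenvalues of the correlation matrix, as given in the problem.
eigenvalues = [2.8, 1.5, 0.4, 0.2, 0.1]

# For a correlation matrix the eigenvalues sum to the number of variables.
total_variance = sum(eigenvalues)

# Proportion of total variance carried by each principal component.
proportions = [ev / total_variance for ev in eigenvalues]

# (1) Share explained by the first PC.
first_pc_share = proportions[0]

# (2) Kaiser's criterion: keep components whose eigenvalue exceeds 1.
kaiser_count = sum(1 for ev in eigenvalues if ev > 1.0)

# (3) Cumulative share retained by the first two PCs.
cumulative_two = sum(proportions[:2])

print(f"first PC share:    {first_pc_share:.2%}")
print(f"PCs kept (Kaiser): {kaiser_count}")
print(f"first two PCs:     {cumulative_two:.2%}")
```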
Test H₀: for a 3-dimensional multivariate normal with n = 25 observations.
Given: , , and sample covariance S.
(1) Write the T² statistic formula.
(2) What is the null distribution of T²?
(3) How do you convert T² to an F-statistic?
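As a rough sketch of the computation, the code below uses the standard one-sample form T² = n (x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀) and the conversion F = (n − p) / (p (n − 1)) · T², which follows an F distribution with (p, n − p) degrees of freedom under H₀. Only n = 25 and p = 3 come from the problem; the values of xbar, mu0, and S below are hypothetical placeholders, not the problem's data.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] with float entries.
    M = [[float(v) for v in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def hotelling_t2(xbar, mu0, S, n):
    """One-sample Hotelling T^2 = n * d' S^{-1} d, with d = xbar - mu0."""
    d = [a - b for a, b in zip(xbar, mu0)]
    w = solve_linear(S, d)  # w = S^{-1} d without forming the inverse
    return n * sum(di * wi for di, wi in zip(d, w))


def t2_to_f(t2, n, p):
    """Scale T^2 so that it follows F(p, n - p) under H0."""
    return (n - p) / (p * (n - 1)) * t2


# Hypothetical inputs for illustration only (not the problem's values).
n, p = 25, 3
xbar = [1.0, 0.5, -0.5]
mu0 = [0.0, 0.0, 0.0]
S = [[4.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

t2 = hotelling_t2(xbar, mu0, S, n)
f_stat = t2_to_f(t2, n, p)
print(f"T^2 = {t2:.4f}, F = {f_stat:.4f} on ({p}, {n - p}) df")
```

The resulting F value would then be compared with the upper critical value of the F distribution with 3 and 22 degrees of freedom.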
Two populations with equal covariance matrices:
(1) Find Fisher's linear discriminant function.
(2) Classify a new observation assuming equal priors.
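A minimal sketch of the two steps, assuming the usual form of Fisher's rule: the discriminant direction is a = Σ⁻¹(μ₁ − μ₂), and with equal priors and equal costs a new x is assigned to population 1 when aᵀx is at least aᵀ(μ₁ + μ₂)/2. The means, covariance matrix, and new observation below are hypothetical two-dimensional values chosen for illustration; they are not the problem's data.

```python
def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(M, v):
    """Matrix-vector product for small dense matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical population parameters (illustration only).
mu1 = [2.0, 1.0]
mu2 = [0.0, 0.0]
sigma = [[2.0, 1.0],
         [1.0, 2.0]]  # common covariance matrix

# Discriminant direction a = Sigma^{-1} (mu1 - mu2).
diff = [m1 - m2 for m1, m2 in zip(mu1, mu2)]
a = mat_vec(inverse_2x2(sigma), diff)

# Cutoff: projection of the midpoint between the two means.
midpoint = [(m1 + m2) / 2 for m1, m2 in zip(mu1, mu2)]
cutoff = dot(a, midpoint)


def classify(x):
    """Return 1 or 2 under equal priors and equal misclassification costs."""
    return 1 if dot(a, x) >= cutoff else 2


x_new = [1.5, 0.2]  # hypothetical new observation
label = classify(x_new)
print(f"a = {a}, cutoff = {cutoff}, assigned population = {label}")
```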
After extracting 2 factors, the unrotated loadings are:
(1) Why is rotation often applied?
(2) What is the goal of varimax rotation?
(3) Interpret the factor pattern (which variables load on which factors).
K-means clustering with k = 3 produces the following within-cluster sums of squares: WSS₁ = 50, WSS₂ = 45, WSS₃ = 40.
(1) Calculate total WSS.
(2) If total SS = 200, find the R² measure (the proportion of total SS explained by the clustering).
(3) Describe the silhouette coefficient and its interpretation.
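The quantities in this problem can be sketched as follows, assuming R² is defined as 1 − WSS_total / SS_total (equivalently, between-cluster SS over total SS). The silhouette helper uses the usual per-point definition s = (b − a) / max(a, b), where a is the mean distance to points in the same cluster and b is the mean distance to points in the nearest other cluster; the a and b values passed to it are hypothetical.

```python
# Within-cluster sums of squares and total SS, as given in the problem.
wss = [50, 45, 40]
total_ss = 200

# (1) Total within-cluster sum of squares.
total_wss = sum(wss)

# (2) R^2: share of total variation explained by the cluster structure.
r_squared = 1 - total_wss / total_ss


def silhouette_value(a, b):
    """Silhouette of one point: values near 1 mean well clustered,
    near 0 mean on a cluster boundary, negative mean likely misassigned."""
    return (b - a) / max(a, b)


print(f"total WSS = {total_wss}")
print(f"R^2       = {r_squared:.3f}")
# Hypothetical distances for one point: a = 1.0 (own cluster), b = 3.0 (nearest other).
print(f"silhouette example = {silhouette_value(1.0, 3.0):.3f}")
```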
Two sets of variables: and .
First canonical correlation ρ₁ = 0.85.
(1) What does canonical correlation measure?
(2) How many canonical correlations exist?
(3) Interpret ρ₁ = 0.85.
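Two of these parts reduce to simple arithmetic, sketched below. The number of canonical correlations is min(p, q) for sets of p and q variables, and ρ₁² is the proportion of variance shared by the first pair of canonical variates. The set sizes used in the example call are hypothetical, because the variable lists are not legible here.

```python
def n_canonical_correlations(p, q):
    """Number of canonical correlations for sets of p and q variables."""
    return min(p, q)


# First canonical correlation, as given in the problem.
rho1 = 0.85

# Squared canonical correlation: variance shared by the first pair of
# canonical variates (not by the original variable sets themselves).
shared_variance = rho1 ** 2

# Hypothetical set sizes, for illustration only.
example_count = n_canonical_correlations(3, 2)

print(f"rho1^2 = {shared_variance:.4f}")
print(f"canonical correlations for p = 3, q = 2: {example_count}")
```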
Compare 3 treatment groups on 4 response variables.
(1) Why use MANOVA instead of 4 separate ANOVAs?
(2) State the null hypothesis in MANOVA.
(3) Name three MANOVA test statistics and their properties.
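The common MANOVA statistics are all functions of the eigenvalues λᵢ of E⁻¹H, where H is the hypothesis (between-groups) SSCP matrix and E is the error (within-groups) SSCP matrix. With 3 groups and 4 responses there are at most min(4, 3 − 1) = 2 nonzero eigenvalues. The sketch below uses hypothetical eigenvalues purely to show how each statistic is formed; Roy's statistic is reported here as the largest eigenvalue, though some software reports λ₁ / (1 + λ₁) instead.

```python
import math


def manova_statistics(eigenvalues):
    """Compute four standard MANOVA statistics from the eigenvalues of E^{-1} H."""
    wilks = math.prod(1.0 / (1.0 + lam) for lam in eigenvalues)
    pillai = sum(lam / (1.0 + lam) for lam in eigenvalues)
    hotelling_lawley = sum(eigenvalues)
    roy = max(eigenvalues)
    return {
        "wilks_lambda": wilks,                # small values favor rejecting H0
        "pillai_trace": pillai,               # large values favor rejecting H0
        "hotelling_lawley_trace": hotelling_lawley,
        "roy_largest_root": roy,
    }


# Number of nonzero eigenvalues for this design: min(p, g - 1).
n_responses, n_groups = 4, 3
s = min(n_responses, n_groups - 1)

# Hypothetical eigenvalues (illustration only, not derived from data).
example_eigenvalues = [1.0, 0.25]
stats = manova_statistics(example_eigenvalues)

for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```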