Hotelling's T², two-sample tests, MANOVA, and multivariate hypothesis testing
Test
H₀: μ = μ₀ vs H₁: μ ≠ μ₀, with T² = n(x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀)
Distribution under H₀
T² ~ [(n−1)p/(n−p)] · F(p, n−p)
Reject H₀ when
T² > [(n−1)p/(n−p)] · F(p, n−p; α)
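As a minimal sketch of the one-sample test (NumPy/SciPy; the function name and the synthetic data are illustrative, not from the original notes):

```python
import numpy as np
from scipy import stats

def hotelling_t2_one_sample(X, mu0):
    """One-sample Hotelling's T² test of H0: mu = mu0.

    X: (n, p) data matrix; mu0: length-p hypothesized mean vector.
    Returns (T2, F, p_value).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)            # sample covariance (divisor n-1)
    diff = xbar - np.asarray(mu0, dtype=float)
    t2 = n * diff @ np.linalg.solve(S, diff)
    # Under H0, [(n-p)/((n-1)p)] T² follows F(p, n-p)
    f_stat = (n - p) / ((n - 1) * p) * t2
    p_value = stats.f.sf(f_stat, p, n - p)
    return t2, f_stat, p_value

# Illustrative data: n=25 observations on p=3 variables, true mean 0
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 3))
t2, f_stat, p_value = hotelling_t2_one_sample(X, mu0=[0.0, 0.0, 0.0])
```

Using `np.linalg.solve` instead of explicitly inverting S is both faster and numerically safer.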
Test (assuming equal covariance matrices)
T² = [n₁n₂/(n₁+n₂)] (x̄₁ − x̄₂)ᵀ Sₚ⁻¹ (x̄₁ − x̄₂)
Pooled Covariance
Sₚ = [(n₁−1)S₁ + (n₂−1)S₂] / (n₁+n₂−2)
F-transformation
F = [(n₁+n₂−p−1) / ((n₁+n₂−2)p)] · T² ~ F(p, n₁+n₂−p−1)
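The two-sample procedure with a pooled covariance can be sketched as follows (synthetic data; the function name is my own):

```python
import numpy as np
from scipy import stats

def hotelling_t2_two_sample(X1, X2):
    """Two-sample Hotelling's T² assuming equal covariance matrices.

    Returns (T2, F, p_value) using the pooled covariance S_p.
    """
    X1, X2 = np.asarray(X1, float), np.asarray(X2, float)
    n1, p = X1.shape
    n2, _ = X2.shape
    d = X1.mean(axis=0) - X2.mean(axis=0)
    # Pooled covariance: S_p = [(n1-1)S1 + (n2-1)S2] / (n1+n2-2)
    Sp = ((n1 - 1) * np.cov(X1, rowvar=False)
          + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(Sp, d)
    # F-transformation: F ~ F(p, n1+n2-p-1) under H0
    f_stat = (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p) * t2
    p_value = stats.f.sf(f_stat, p, n1 + n2 - p - 1)
    return t2, f_stat, p_value

rng = np.random.default_rng(1)
X1 = rng.normal(size=(20, 3))            # group 1: n1=20
X2 = rng.normal(loc=0.5, size=(25, 3))   # group 2: n2=25, shifted mean
t2, f_stat, p_value = hotelling_t2_two_sample(X1, X2)
```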
For matched pairs (x₁ᵢ, x₂ᵢ), compute differences dᵢ = x₁ᵢ − x₂ᵢ and test H₀: μ_d = 0 with the one-sample statistic T² = n d̄ᵀ S_d⁻¹ d̄
Distribution
T² ~ [(n−1)p/(n−p)] · F(p, n−p)
A 100(1−α)% confidence region for μ:
{μ : n(x̄ − μ)ᵀ S⁻¹ (x̄ − μ) ≤ [(n−1)p/(n−p)] · F(p, n−p; α)}
Shape
Ellipsoid centered at x̄
Axes
Directions given by the eigenvectors of S; lengths proportional to the square roots of the corresponding eigenvalues
For any linear combination aᵀμ:
aᵀx̄ ± √{[(n−1)p/(n−p)] · F(p, n−p; α)} · √(aᵀSa/n)
Property
All such intervals hold simultaneously with confidence 1−α
For m pre-specified comparisons, use α/m for each:
aᵢᵀx̄ ± t(n−1; α/2m) · √(aᵢᵀSaᵢ/n)
When to use
Bonferroni intervals are tighter than the simultaneous T² intervals when the number of comparisons m is small
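The trade-off can be seen by comparing the two critical multipliers directly (SciPy; n, p, and m are illustrative values):

```python
import numpy as np
from scipy import stats

n, p, alpha, m = 25, 3, 0.05, 3   # m pre-specified comparisons

# Simultaneous T² multiplier: sqrt([(n-1)p/(n-p)] * F(p, n-p; alpha))
c_t2 = np.sqrt((n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p))

# Bonferroni multiplier: t(n-1; alpha/(2m))
c_bonf = stats.t.ppf(1 - alpha / (2 * m), n - 1)

# With only m=3 planned comparisons, the Bonferroni multiplier is the
# smaller of the two, so its intervals are narrower.
```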
Test H₀: μ₁ = μ₂ = … = μ_g for g groups
Within-group SS
W = Σₖ Σᵢ (xₖᵢ − x̄ₖ)(xₖᵢ − x̄ₖ)ᵀ
Between-group SS
B = Σₖ nₖ (x̄ₖ − x̄)(x̄ₖ − x̄)ᵀ
Total SS
T = B + W
Wilks' Lambda
Λ = |W| / |B+W|. Likelihood ratio statistic; smaller Λ → reject H₀
Pillai's Trace
V = tr[B(B+W)⁻¹]. Most robust to violations
Lawley-Hotelling Trace
U = tr(BW⁻¹). Generalization of the F-statistic
Roy's Largest Root
Largest eigenvalue of W⁻¹B
All four statistics are functions of the eigenvalues λ₁ ≥ λ₂ ≥ … of W⁻¹B
Degrees of freedom
Depend on p, g, and total N through the F-approximation used for each statistic (e.g. Rao's approximation for Wilks' Λ)
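Since all four statistics are functions of the eigenvalues of W⁻¹B, they can be computed together (the matrices below are a small made-up example):

```python
import numpy as np

def manova_statistics(B, W):
    """Compute the four MANOVA test statistics from the eigenvalues
    of W^{-1}B (B = between-group, W = within-group SS&CP matrices)."""
    eigs = np.linalg.eigvals(np.linalg.solve(W, B)).real
    wilks = np.prod(1.0 / (1.0 + eigs))     # Λ = |W|/|B+W|
    pillai = np.sum(eigs / (1.0 + eigs))    # V = tr[B(B+W)^{-1}]
    lawley = np.sum(eigs)                   # U = tr(BW^{-1})
    roy = np.max(eigs)                      # Roy's largest root
    return wilks, pillai, lawley, roy

# Illustrative SS&CP matrices with p=2
B = np.array([[8.0, 2.0], [2.0, 4.0]])
W = np.array([[20.0, 3.0], [3.0, 15.0]])
wilks, pillai, lawley, roy = manova_statistics(B, W)

# Cross-check Wilks' Λ against the determinant form |W|/|B+W|
wilks_det = np.linalg.det(W) / np.linalg.det(B + W)
```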
1. Parallelism (Equal Slopes)
H₀: Cμ₁ = Cμ₂ = … = Cμ_g, where C is a (p−1)×p contrast matrix
2. Equal Levels (Coincident Profiles)
H₀: 1ᵀμ₁ = 1ᵀμ₂ = … = 1ᵀμ_g (test only if parallel)
3. Flatness
H₀: Cμ = 0 for the common mean profile μ (test only if coincident)
For p variables, the contrast matrix C has p−1 rows, e.g. row i has 1 in column i and −1 in column i+1 and zeros elsewhere:
C = [ 1 −1 0 … 0 ; 0 1 −1 … 0 ; … ; 0 … 0 1 −1 ]
Interpretation
Cμ gives successive differences between adjacent means
Power of the T² test depends on the non-centrality parameter:
δ² = nΔ²
Effect Size
Mahalanobis distance: Δ² = (μ − μ₀)ᵀ Σ⁻¹ (μ − μ₀)
Sample Size
Larger n increases power for a fixed effect size
Multivariate Normality
Each group follows multivariate normal distribution
Homogeneity of Covariance
All groups share a common covariance matrix Σ (required by the pooled two-sample test and MANOVA)
Independence
Observations are independent within and between groups
Random Sampling
Random samples from populations
Tests
Box's M test checks equality of covariance matrices across groups. Caution: Box's M is sensitive to non-normality; a significant result may indicate non-normality rather than unequal covariances.
Test whether the mean vector differs from a hypothesized value μ₀ with n=25, p=3:
Given
T² = 15.6, n=25, p=3
Convert to F
F = [(n−p)/((n−1)p)] · T² = (22/72) · 15.6 ≈ 4.77
Decision: Compare F to F(3, 22; 0.05) ≈ 3.05. Since 4.77 > 3.05, reject H₀ at α=0.05.
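The arithmetic of this example can be checked with SciPy (the T² value of 15.6 is back-solved from the F statistic quoted in the example):

```python
from scipy import stats

n, p, alpha = 25, 3, 0.05
t2 = 15.6                  # observed T² consistent with F ≈ 4.77

# F = [(n-p)/((n-1)p)] * T² = (22/72) * 15.6
f_stat = (n - p) / ((n - 1) * p) * t2

# Critical value F(3, 22; 0.05)
f_crit = stats.f.ppf(1 - alpha, p, n - p)

reject = f_stat > f_crit   # True: reject H0 at alpha = 0.05
```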
For large samples, T² is approximately chi-squared: T² ≈ χ²(p), so reject H₀ when T² > χ²(p; α)
Advantage
No normality assumption required (CLT)
When to use
n > 50 or when normality is questionable
Test H₀: Cμ = 0, where C is a contrast matrix:
Applications
Testing specific contrasts, comparing subsets of means, repeated measures analysis
Wilks' Lambda (Λ)
Most commonly used; ratio of error to total variance
Pillai's Trace (V)
Most robust to violations; sum of squared canonical correlations
Lawley-Hotelling Trace (U)
Powerful when groups differ on one dimension
Roy's Largest Root (θ)
Most powerful but most sensitive to violations
Default Choice
Use Pillai's Trace for robustness, especially with unequal n or assumption violations
When All Agree
If all four statistics lead to same conclusion, report Wilks' Lambda (most common)
Parallelism
Are profiles parallel? Test if slopes are equal across groups
Levels
Are profiles at same level? Test overall group means
Flatness
Are profiles flat? Test if all variables have same mean
Testing order: First test parallelism. If parallel, test levels. If not parallel, examine interaction.
Multivariate Normality
Test with Mardia's skewness/kurtosis or Q-Q plots of Mahalanobis distances
Homogeneity of Covariances
Box's M test (but sensitive to non-normality); use Pillai if violated
Independence
Random sampling; observations independent within and between groups
No Multicollinearity
Variables should not be perfectly correlated; check condition number
To Non-Normality
Fairly robust with large n (CLT); symmetric distributions less problematic than skewed
To Unequal Covariances
Less robust; use Pillai's trace or transform data
Partial Eta-Squared
η²ₚ = 1 − Λ^(1/s), where s = min(p, g−1)
Multivariate Eta-Squared
η²_mult = 1 − Λ
Interpretation: Small ≈ 0.01, Medium ≈ 0.06, Large ≈ 0.14 (Cohen's guidelines)
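Both effect sizes follow directly from Wilks' Λ; a quick check (the Λ value here is illustrative, e.g. from a design with p=2 DVs and g=3 groups):

```python
# Effect sizes derived from Wilks' Lambda (illustrative values)
wilks = 0.694            # e.g. from a MANOVA with p=2 DVs, g=3 groups
p_vars, g = 2, 3
s = min(p_vars, g - 1)   # number of nonzero eigenvalues of W^{-1}B

eta2_mult = 1.0 - wilks                   # multivariate eta-squared
eta2_partial = 1.0 - wilks ** (1.0 / s)   # partial eta-squared
```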
Power depends on:
Factors Increasing Power
Larger sample size, larger effect size (Mahalanobis distance), higher α, fewer dependent variables carrying the same effect
Factors Decreasing Power
More dependent variables without added group separation, small effect sizes, unequal group sizes, assumption violations
Univariate ANOVAs
Test each DV separately with Bonferroni correction
Discriminant Analysis
Identify which linear combinations of DVs distinguish groups
Stepdown Analysis
Sequential ANCOVAs controlling for prior DVs
Contrast Analysis
Test specific hypotheses about group differences
When same subjects measured at multiple times/conditions:
Advantages
More power; controls individual differences; fewer subjects needed
Sphericity Assumption
Equal variances of differences between conditions (Mauchly's test)
Greenhouse-Geisser
Conservative correction; adjust df by ε
Huynh-Feldt
Less conservative; use when ε > 0.75
R
Hotelling::hotelling.test(), stats::manova() (base R)
Python
statsmodels.multivariate.manova
SPSS
Analyze → General Linear Model → Multivariate
SAS
PROC GLM with MANOVA statement
Profile analysis tests three questions about group profiles across repeated measures:
1. Parallelism
Do groups have similar patterns across variables?
2. Levels
Do groups differ in overall mean? (Test only if parallel)
3. Flatness
Averaged over groups, do all variables have the same mean (is the profile flat)?
Study: Compare treatment and control groups on cognitive tests over time
Multivariate η²
Proportion of variance explained
Partial η²
Effect size for each DV separately
Cohen's Guidelines
Small: 0.01, Medium: 0.06, Large: 0.14
Mahalanobis D²
Standardized distance between mean vectors: D² = (x̄₁ − x̄₂)ᵀ Sₚ⁻¹ (x̄₁ − x̄₂)
Considerations for determining sample size:
Issue: Unbalanced designs reduce power
Solution: Use Type III SS; check Box's M test for homogeneity
Issue: Listwise deletion reduces power
Solution: Use multiple imputation or maximum likelihood estimation
Issue: Outliers inflate error variance
Solution: Use Mahalanobis distance; consider robust methods
Issue: Violations affect Type I error
Solution: Use permutation tests; transformations; larger sample
Under the null hypothesis and multivariate normality:
[(n−p)/((n−1)p)] · T² ~ F(p, n−p)
The statistic can be written as:
T² = n(x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀)
Key Property
T² is invariant under affine transformations
Distribution
Exact finite-sample distribution known
T² is n times the squared Mahalanobis distance:
T² = nD², where D² = (x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀)
Interpretation: D² measures the standardized distance between the sample mean and the hypothesized mean, accounting for the correlation structure.
Problem: Compare mean vectors of two treatments with p=3 variables
Group 1: n₁=20
Group 2: n₂=25
Step 1: Compute the pooled covariance matrix Sₚ = [(n₁−1)S₁ + (n₂−1)S₂]/(n₁+n₂−2)
Step 2: Calculate T² = [n₁n₂/(n₁+n₂)] (x̄₁ − x̄₂)ᵀ Sₚ⁻¹ (x̄₁ − x̄₂)
Step 3: Convert to F: F = [(n₁+n₂−p−1)/((n₁+n₂−2)p)] · T² = (41/129) · T² = 2.67
Critical Value
F(3, 41; 0.05) ≈ 2.84
Decision
2.67 < 2.84 → Fail to reject H₀ at α=0.05
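The degrees of freedom and critical value for this example can be verified with SciPy:

```python
from scipy import stats

n1, n2, p, alpha = 20, 25, 3, 0.05
f_stat = 2.67                      # observed F from the example

df1, df2 = p, n1 + n2 - p - 1      # (3, 41)
f_crit = stats.f.ppf(1 - alpha, df1, df2)

reject = f_stat > f_crit           # False: fail to reject H0
```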
Scenario: Compare three teaching methods on two outcome variables
n₁=15, n₂=15, n₃=15, p=2
Within-group SS&CP matrix: |W| = 125.0
Total SS&CP matrix: |B+W| = 180.0
Calculate Wilks' Lambda:
Λ = |W| / |B+W| = 125.0/180.0 ≈ 0.694
F-transformation (exact for p=2):
F = [(1 − √Λ)/√Λ] · [(N−g−1)/(g−1)] = 0.2 × 20.5 = 4.10, with df (2(g−1), 2(N−g−1)) = (4, 82)
Conclusion: F(4, 82) = 4.10, p < 0.01. Significant difference among teaching methods.
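This worked example can be reproduced numerically (SciPy; the determinants come from the scenario above):

```python
import numpy as np
from scipy import stats

N, g, p = 45, 3, 2
det_W, det_T = 125.0, 180.0

wilks = det_W / det_T                      # Λ ≈ 0.694

# For p = 2 the F transform of Wilks' Lambda is exact:
# F = [(1 - sqrt(Λ))/sqrt(Λ)] * [(N-g-1)/(g-1)], df = (2(g-1), 2(N-g-1))
root = np.sqrt(wilks)
f_stat = (1 - root) / root * (N - g - 1) / (g - 1)
df1, df2 = 2 * (g - 1), 2 * (N - g - 1)    # (4, 82)
p_value = stats.f.sf(f_stat, df1, df2)
```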
For any linear combination aᵀμ, a 100(1−α)% simultaneous confidence interval:
aᵀx̄ ± c · √(aᵀSa/n)
Critical Value
c² = [(n−1)p/(n−p)] · F(p, n−p; α)
Coverage: All such intervals simultaneously contain the true values aᵀμ with probability 1−α
T² Method
Covers every possible linear combination; intervals are wider
Bonferroni Method
Covers only the m planned comparisons; narrower when m is small
Non-Normal Data
Consider a permutation test or the large-sample χ² approximation
Unequal Covariances
Use a Behrens-Fisher approach (e.g. Yao's or Nel-van der Merwe test)
Small Samples
Exact T²/F results rely on multivariate normality; interpret cautiously
Outliers Present
Consider robust estimates of location and scatter before testing
Distribution-free alternative to the T² test: permute the group labels, recompute T² for each permutation, and take the p-value as the proportion of permuted statistics at least as large as the observed one.
Advantage: No distributional assumptions; exact Type I error control with sufficient permutations.
Comparing two groups with unequal covariance matrices:
Challenge
No exact solution; requires approximations
Approaches
Yao's test, Nel-van der Merwe test, bootstrap methods
When p is large relative to n:
Issues
S may be singular; classical T² fails
Solutions
Regularized covariance estimation, Dempster's test, dimensionality reduction
Use MANOVA when you have multiple correlated dependent variables. It controls overall Type I error and accounts for correlations between variables.
Multivariate normality, random sampling, and (for two-sample) equal covariance matrices. It's robust to mild non-normality with large samples.
Lambda ranges from 0 to 1. Values close to 0 indicate large group differences (reject H₀); values close to 1 suggest groups are similar.
Tests homogeneity of covariance matrices across groups. Very sensitive to non-normality. If violated with unequal n, consider robust methods or separate analyses.
Keep DVs moderate (typically <10). More DVs require larger sample sizes and may reduce power. Consider theoretical justification for each DV.