Distribution-Free Methods

Nonparametric Hypothesis Testing

Master distribution-free statistical tests for robust hypothesis testing without strict distributional assumptions

Key Concepts & Definitions

Essential terminology and mathematical foundations for nonparametric testing

Nonparametric Test

Statistical hypothesis test that does not rely on specific distributional assumptions about the population, using only rank, sign, or frequency information from sample data.

Test statistic based on ranks, signs, or frequencies
Sign Test Statistic (N⁺)

Count of positive differences or values above the hypothesized median in sign test procedures.

N^+ = \sum_{i=1}^n I(X_i - t_0 > 0) \sim B(n, \theta)
Rank Sum (W)

Sum of ranks assigned to one sample group in Wilcoxon rank sum test for comparing two independent samples.

W = \sum_{i=1}^n R_i \text{ where } R_i \text{ is the rank of } Y_i
Empirical Distribution Function

Step function that estimates the cumulative distribution function from sample data, used in Kolmogorov-Smirnov tests.

F_n(x) = \frac{1}{n} \sum_{i=1}^n I(X_i \leq x)

Fundamental Concepts

Core principles and advantages of nonparametric hypothesis testing

Core Principles

Distribution-Free Nature

Tests do not require specific distributional assumptions about the population, making them robust and applicable to diverse data types and situations.

Rank-Based Analysis

Many nonparametric tests use rank information instead of raw values, reducing sensitivity to outliers and extreme observations.

Ordinal Data Compatibility

Suitable for ordinal, interval, and ratio scale data, providing flexibility for different measurement levels and research contexts.

Key Advantages

Robustness

Resistant to outliers, non-normal distributions, and violations of parametric test assumptions, providing reliable results across diverse scenarios.

Small Sample Efficiency

Effective with small sample sizes where parametric test assumptions may not hold, making them valuable for pilot studies and limited data.

Broad Applicability

Applicable to categorical, ordinal, and continuous data without requiring transformation or complex distributional modeling.

Nonparametric Test Methods

Comprehensive overview of major nonparametric hypothesis testing procedures

Sign Test
Tests median values and paired sample differences using only sign information

Application Scenarios

Single sample median testing
Paired sample comparisons
Ordinal data analysis

Key Formulas

Single sample: N^+ = \sum I(X_i > t_0) \sim B(n, \theta)
Paired sample: N^+ = \sum I(Z_i > 0) \sim B(n, 1/2) \text{ under } H_0
Rejection region (right-tail): N^+ \geq C^*
Rejection region (two-tail): N^+ \leq C_1^* \text{ or } N^+ \geq C_2^*

Methodology Steps

1. Convert sample values to binary indicators (positive/negative)
2. Count the number of positive differences (N⁺)
3. Use the binomial distribution for significance testing
4. Apply appropriate critical values based on the alternative hypothesis

Advantages

Distribution-free
Robust to outliers
Simple computation

Limitations

Ignores magnitude information
Lower power than parametric tests
Wilcoxon Rank Sum Test
Compares two independent samples using rank information (Mann-Whitney U equivalent)

Application Scenarios

Two independent sample comparison
Location shift detection
Ordinal data comparison

Key Formulas

Rank sum: W = \sum_{i=1}^n R_i \text{ (ranks of the second sample)}
Expected value: E[W] = \frac{n(m+n+1)}{2}
Variance: \text{Var}(W) = \frac{mn(m+n+1)}{12}
Large sample: W^* = \frac{W - E[W]}{\sqrt{\text{Var}(W)}} \sim N(0,1)

Methodology Steps

1. Combine and rank all observations from both samples
2. Calculate the sum of ranks for one sample group
3. Compare the rank sum to its expected value under the null hypothesis
4. Use the normal approximation for large samples

Advantages

Uses magnitude information
More powerful than sign test
Handles tied observations

Limitations

Requires ordinal or continuous data
Assumes same distribution shape
Wilcoxon Signed Rank Test
Analyzes paired samples using both sign and magnitude information through ranks

Application Scenarios

Paired sample analysis
Before-after comparisons
Symmetric distribution testing

Key Formulas

Signed rank statistic: W^+ = \sum_{i=1}^n R_i I(Z_i > 0)
Expected value: E[W^+] = \frac{n(n+1)}{4}
Variance: \text{Var}(W^+) = \frac{n(n+1)(2n+1)}{24}
Large sample: W^{*+} = \frac{W^+ - E[W^+]}{\sqrt{\text{Var}(W^+)}} \sim N(0,1)

Methodology Steps

1. Calculate differences between paired observations
2. Rank the absolute values of the non-zero differences
3. Sum the ranks corresponding to positive differences
4. Compare to the expected value under the null hypothesis

Advantages

Combines sign and magnitude
More powerful than sign test
Robust to outliers

Limitations

Assumes symmetric distribution
Cannot handle extreme ties
Chi-Square Goodness-of-Fit Test
Tests whether sample data follows a specified theoretical distribution

Application Scenarios

Distribution fitting
Model validation
Categorical data analysis

Key Formulas

Test statistic: \chi^2 = \sum_{i=1}^r \frac{(O_i - E_i)^2}{E_i}
Degrees of freedom: df = r - m - 1
Expected frequency: E_i = n \cdot p_i(\hat{\theta})
Rejection region: \chi^2 > \chi^2_{\alpha}(df)

Methodology Steps

1. Group data into categories or intervals
2. Calculate observed frequencies for each category
3. Compute expected frequencies under the null hypothesis
4. Calculate the chi-square test statistic

Advantages

Flexible for any distribution
Handles categorical data
Well-established theory

Limitations

Requires large samples
Loses information through grouping
Sensitive to category choice
Kolmogorov-Smirnov Test
Compares empirical and theoretical distributions using maximum absolute difference

Application Scenarios

Distribution fitting
Two-sample comparison
Continuous data analysis

Key Formulas

Test statistic: D_n = \sup_x |F_n(x) - F_0(x)|
Two-sample: D_{m,n} = \sup_x |F_{1,m}(x) - F_{2,n}(x)|
Large sample: \sqrt{n} D_n \xrightarrow{d} K \text{ (Kolmogorov distribution)}
Critical value: reject H_0 if D_n > D_{n,\alpha}

Methodology Steps

1. Construct the empirical distribution function from the sample
2. Calculate the maximum absolute difference from the theoretical CDF
3. Compare the test statistic to critical values
4. Use the asymptotic distribution for large samples

Advantages

No grouping required
Uses all sample information
Distribution-free

Limitations

Only for continuous data
Conservative test
Sensitive to ties
Chi-Square Independence Test
Tests independence between two categorical variables using contingency tables

Application Scenarios

Variable independence
Association analysis
Categorical data relationships

Key Formulas

Test statistic: \chi^2 = \sum_{i=1}^r \sum_{j=1}^s \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
Expected frequency: E_{ij} = \frac{n_{i\cdot} n_{\cdot j}}{n}
Degrees of freedom: df = (r-1)(s-1)
Rejection region: \chi^2 > \chi^2_{\alpha}((r-1)(s-1))

Methodology Steps

1. Organize data into an r×s contingency table
2. Calculate expected frequencies under independence
3. Compute the chi-square test statistic
4. Compare to the critical value with the appropriate degrees of freedom

Advantages

Tests general association
Handles multiple categories
Intuitive interpretation

Limitations

Requires adequate sample sizes
Only detects association, not causation
Sensitive to small expected frequencies
Run Test
Tests randomness of sequences by counting runs of consecutive identical symbols

Application Scenarios

Randomness testing
Pattern detection
Time series analysis

Key Formulas

Run count: R = \text{number of runs in the sequence}
Expected runs: E[R] = \frac{2n_1n_2}{n_1+n_2} + 1
Variance: \text{Var}(R) = \frac{2n_1n_2(2n_1n_2-n_1-n_2)}{(n_1+n_2)^2(n_1+n_2-1)}
Large sample: \frac{R - E[R]}{\sqrt{\text{Var}(R)}} \sim N(0,1)

Methodology Steps

1. Convert the sequence to binary (0-1) format
2. Count the total number of runs (maximal blocks of identical symbols)
3. Compare the observed run count to its expectation under randomness
4. Use the normal approximation for large samples

Advantages

Simple randomness test
No distributional assumptions
Detects systematic patterns

Limitations

Only detects certain patterns
Less powerful than other tests
Binary conversion loses information

Worked Examples

Step-by-step solutions to nonparametric testing problems

Sign Test for Population Median

Problem:

A manufacturer claims that the median lifetime of their light bulbs is 1000 hours. A consumer group tests 12 bulbs and obtains the following lifetimes (in hours): 985, 1010, 992, 1008, 1015, 988, 1020, 995, 1012, 990, 1005, 1002. Test the claim at α = 0.05 using the Sign Test.

Solution:

  1. State hypotheses

     H_0: M = 1000 \text{ (population median is 1000 hours)}
     H_1: M \neq 1000 \text{ (two-tailed test)}

  2. Compute signs

     Compare each observation to the hypothesized median M₀ = 1000:

     \text{Signs: } -, +, -, +, +, -, +, -, +, -, +, + \text{ (ties would be excluded)}

  3. Count positive signs

     Count the observations above and below the median:

     N^+ = 7, \quad N^- = 5, \quad n = 12

  4. Find the test statistic

     Under H₀, N⁺ ~ Binomial(12, 0.5). Use the smaller count:

     \text{Test statistic} = \min(N^+, N^-) = \min(7, 5) = 5

  5. Determine the critical value

     For a two-tailed test at α = 0.05 with n = 12, the binomial table gives:

     P(N^+ \leq 2 \text{ or } N^+ \geq 10) \approx 0.039 < 0.05

  6. Conclusion

     Since 5 > 2 (the critical value), we fail to reject H₀: there is not enough evidence to reject the claim that the median lifetime is 1000 hours.

Key Insight:

The Sign Test only uses directional information (+/-) and ignores magnitude. With 7 positive and 5 negative differences, the result is not extreme enough to reject H₀ at the 5% significance level.
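
For readers who want to reproduce these numbers, here is a minimal Python sketch of the same sign test, using only the standard library and the data above (the p-value route rather than the critical-value table, which gives the same conclusion):

```python
# Sign test for H0: M = 1000 vs H1: M != 1000 (two-tailed), stdlib only.
from math import comb

data = [985, 1010, 992, 1008, 1015, 988, 1020, 995, 1012, 990, 1005, 1002]
m0 = 1000

n_pos = sum(x > m0 for x in data)   # N+ = 7
n = sum(x != m0 for x in data)      # ties excluded; n = 12

# Two-tailed p-value: 2 * P(N+ >= max(N+, N-)) under Binomial(n, 0.5)
k = max(n_pos, n - n_pos)
p_value = 2 * sum(comb(n, j) for j in range(k, n + 1)) / 2**n
print(n_pos, round(p_value, 3))     # 7, 0.774 -> fail to reject H0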

Sign Test for Paired Samples

Problem:

A study compares blood pressure before and after a new medication for 10 patients. The differences (After - Before) are: -8, -5, +2, -12, -3, +1, -6, -9, -4, -7. Test whether the medication reduces blood pressure at α = 0.05.

Solution:

  1. State hypotheses

     Let M_D be the median of the differences (After − Before):

     H_0: M_D = 0, \quad H_1: M_D < 0 \text{ (one-tailed, reduction expected)}

  2. Count signs

     Count the positive and negative differences:

     N^+ = 2 \text{ (positive)}, \quad N^- = 8 \text{ (negative)}, \quad n = 10

  3. Identify the test statistic

     For a one-tailed test of reduction, use N⁺ as the test statistic:

     \text{Test statistic} = N^+ = 2

  4. Find the p-value

     Under H₀, N⁺ ~ Binomial(10, 0.5). Calculate:

     P(N^+ \leq 2) = \sum_{k=0}^{2} \binom{10}{k}(0.5)^{10} = \frac{1 + 10 + 45}{1024} \approx 0.055

  5. Make a decision

     Compare the p-value to α:

     p = 0.055 > \alpha = 0.05 \Rightarrow \text{fail to reject } H_0

  6. Conclusion

     At α = 0.05 there is insufficient evidence to conclude that the medication reduces blood pressure (at α = 0.10 we would reject H₀).

Key Insight:

The Sign Test for paired data is equivalent to testing whether the median of differences equals zero. With 8 negative out of 10 differences, the p-value (0.055) is borderline but does not quite reach the 0.05 threshold.
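
A quick Python check of the binomial p-value computed above (standard library only; the data are the ten differences from the problem):

```python
# One-sided sign test for paired differences (After - Before).
from math import comb

diffs = [-8, -5, 2, -12, -3, 1, -6, -9, -4, -7]
n_pos = sum(d > 0 for d in diffs)   # N+ = 2
n = sum(d != 0 for d in diffs)      # n = 10

# H1: median difference < 0, so a small N+ is evidence against H0
p_value = sum(comb(n, j) for j in range(n_pos + 1)) / 2**n
print(round(p_value, 4))            # (1 + 10 + 45)/1024 = 0.0547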

Wilcoxon Rank Sum Test (Mann-Whitney U)

Problem:

Two teaching methods are compared by test scores. Method A: 78, 82, 85, 88, 90 (n₁=5). Method B: 72, 76, 80, 84, 86, 92, 94 (n₂=7). Test whether there is a significant difference at α = 0.05.

Solution:

  1. State hypotheses

     Let F₁ and F₂ be the score distributions under Methods A and B:

     H_0: F_1(x) = F_2(x), \quad H_1: F_1(x) \neq F_2(x)

  2. Combine and rank all data

     Order all 12 observations and assign ranks:

     \begin{array}{c|cccccccccccc} \text{Value} & 72 & 76 & 78 & 80 & 82 & 84 & 85 & 86 & 88 & 90 & 92 & 94 \\ \text{Rank} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ \text{Group} & B & B & A & B & A & B & A & B & A & A & B & B \end{array}

  3. Calculate the rank sum for the smaller sample

     Sum of ranks for Method A (n₁ = 5):

     W_A = 3 + 5 + 7 + 9 + 10 = 34

  4. Calculate the expected value

     Under H₀:

     E[W_A] = \frac{n_1(n_1+n_2+1)}{2} = \frac{5(13)}{2} = 32.5

  5. Calculate the variance

     \text{Var}(W_A) = \frac{n_1 n_2 (n_1+n_2+1)}{12} = \frac{5 \times 7 \times 13}{12} \approx 37.917

  6. Compute the Z-statistic

     Standardize the rank sum:

     Z = \frac{W_A - E[W_A]}{\sqrt{\text{Var}(W_A)}} = \frac{34 - 32.5}{\sqrt{37.917}} = \frac{1.5}{6.158} \approx 0.244

  7. Make a decision

     For a two-tailed test at α = 0.05, the critical value is z₀.₀₂₅ = 1.96:

     |Z| = 0.244 < 1.96 \Rightarrow \text{fail to reject } H_0

Key Insight:

The Wilcoxon Rank Sum Test compares the sum of ranks between groups. If one group consistently has higher values, its rank sum will be higher than expected under H₀. Here, the rank sums are nearly equal to expected values, indicating no significant difference.
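
A minimal Python sketch of this rank-sum computation with the normal approximation; it assumes no tied scores, which holds for this data:

```python
# Wilcoxon rank sum (Mann-Whitney equivalent) for the two teaching methods.
from math import sqrt

method_a = [78, 82, 85, 88, 90]
method_b = [72, 76, 80, 84, 86, 92, 94]

# Pool the scores, sort, and assign ranks 1..12 (no ties in this data)
pooled = sorted([(x, "A") for x in method_a] + [(x, "B") for x in method_b])
w_a = sum(rank for rank, (x, grp) in enumerate(pooled, start=1) if grp == "A")

n1, n2 = len(method_a), len(method_b)
e_w = n1 * (n1 + n2 + 1) / 2              # 32.5
var_w = n1 * n2 * (n1 + n2 + 1) / 12      # ~37.917
z = (w_a - e_w) / sqrt(var_w)
print(w_a, round(z, 3))                   # 34, 0.244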

Wilcoxon Signed Rank Test

Problem:

A diet program is tested on 8 participants. Weight changes (Before - After, in kg) are: 2.1, -0.5, 3.2, 1.8, 4.5, 0.9, 2.7, 1.5. Test if the diet causes significant weight loss at α = 0.05.

Solution:

  1. State hypotheses

     Let M_D be the median of the weight differences:

     H_0: M_D = 0, \quad H_1: M_D > 0 \text{ (one-tailed, weight loss expected)}

  2. Rank absolute differences

     Order by |difference| and assign ranks:

     \begin{array}{c|cccccccc} |D| & 0.5 & 0.9 & 1.5 & 1.8 & 2.1 & 2.7 & 3.2 & 4.5 \\ \text{Rank} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \text{Sign} & - & + & + & + & + & + & + & + \end{array}

  3. Calculate the signed rank statistics

     Sum the ranks by sign:

     W^+ = 2+3+4+5+6+7+8 = 35, \quad W^- = 1

  4. Calculate the expected value and variance

     Under H₀:

     E[W^+] = \frac{n(n+1)}{4} = \frac{8(9)}{4} = 18, \quad \text{Var}(W^+) = \frac{n(n+1)(2n+1)}{24} = \frac{8(9)(17)}{24} = 51

  5. Compute the Z-statistic

     Standardize:

     Z = \frac{W^+ - E[W^+]}{\sqrt{\text{Var}(W^+)}} = \frac{35 - 18}{\sqrt{51}} = \frac{17}{7.14} \approx 2.38

  6. Make a decision

     For a one-tailed test at α = 0.05, the critical value is z₀.₀₅ = 1.645:

     Z = 2.38 > 1.645 \Rightarrow \text{reject } H_0

     (With n = 8 an exact table would normally be used; the exact calculation, P(W⁺ ≥ 35) = 2/256 ≈ 0.008, gives the same conclusion.)

  7. Conclusion

     There is significant evidence that the diet program causes weight loss:

     p\text{-value} = P(Z > 2.38) \approx 0.0087 < 0.05

Key Insight:

The Wilcoxon Signed Rank Test uses both sign and magnitude information through ranks. With 7 positive and only 1 negative difference, and the positive differences being larger (higher ranks), the evidence strongly supports weight loss.
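
A minimal Python sketch reproducing W⁺ and the Z-statistic; it assumes no zero differences and no ties among the absolute differences, both true for this data:

```python
# Wilcoxon signed rank test for the diet data (Before - After).
from math import sqrt

diffs = [2.1, -0.5, 3.2, 1.8, 4.5, 0.9, 2.7, 1.5]
nonzero = [d for d in diffs if d != 0]
n = len(nonzero)                                      # 8

# Rank |d| ascending (no ties here), then sum ranks of positive differences
w_plus = sum(rank for rank, d in enumerate(sorted(nonzero, key=abs), start=1)
             if d > 0)                                # 35

e_w = n * (n + 1) / 4                                 # 18
var_w = n * (n + 1) * (2 * n + 1) / 24                # 51
z = (w_plus - e_w) / sqrt(var_w)
print(w_plus, round(z, 2))                            # 35, 2.38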

Chi-Square Goodness-of-Fit Test

Problem:

A die is rolled 120 times with results: 1→18, 2→22, 3→17, 4→25, 5→16, 6→22. Test whether the die is fair at α = 0.05.

Solution:

  1. State hypotheses

     For a fair die, each face has probability 1/6:

     H_0: p_1 = p_2 = \cdots = p_6 = \frac{1}{6}

  2. Calculate expected frequencies

     Under H₀, each face should appear:

     E_i = np_i = 120 \times \frac{1}{6} = 20 \text{ times}

  3. Set up the chi-square statistic

     \chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}

  4. Substitute values

     Calculate each term:

     \chi^2 = \frac{(18-20)^2}{20} + \frac{(22-20)^2}{20} + \frac{(17-20)^2}{20} + \frac{(25-20)^2}{20} + \frac{(16-20)^2}{20} + \frac{(22-20)^2}{20}

  5. Compute the result

     Sum all terms:

     \chi^2 = \frac{4 + 4 + 9 + 25 + 16 + 4}{20} = \frac{62}{20} = 3.1

  6. Determine the critical value

     Degrees of freedom = k − 1 = 6 − 1 = 5:

     \chi^2_{0.05,5} = 11.07

  7. Conclusion

     Compare the test statistic to the critical value:

     \chi^2 = 3.1 < 11.07 \Rightarrow \text{fail to reject } H_0

Key Insight:

The chi-square goodness-of-fit test compares observed frequencies to expected frequencies. The deviation (χ² = 3.1) is well below the critical value (11.07), indicating the observed distribution is consistent with a fair die.
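
A quick Python check of the chi-square statistic for the die data (the critical value 11.07 still comes from a chi-square table with df = 5):

```python
# Chi-square goodness-of-fit for a fair die, stdlib only.
observed = [18, 22, 17, 25, 16, 22]
n = sum(observed)                     # 120
expected = n / len(observed)          # 20 per face under H0

chi2 = sum((o - expected) ** 2 / expected for o in observed)
print(chi2)                           # 3.1 < 11.07 -> fail to reject H0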

Kolmogorov-Smirnov Test

Problem:

Test whether the sample {0.12, 0.35, 0.47, 0.62, 0.81, 0.93} comes from a Uniform(0,1) distribution at α = 0.05.

Solution:

  1. State hypotheses

     Test goodness of fit to Uniform(0,1):

     H_0: F(x) = x \text{ for } 0 \leq x \leq 1

  2. Construct the empirical CDF

     The empirical CDF is a step function:

     F_n(x) = \frac{\text{number of } X_i \leq x}{n}

  3. Define D⁺ and D⁻

     For each order statistic x₍ᵢ₎:

     D^+ = \max_i \left\{\frac{i}{n} - F_0(x_{(i)})\right\}, \quad D^- = \max_i \left\{F_0(x_{(i)}) - \frac{i-1}{n}\right\}

  4. Create the computation table

     With n = 6 and F₀(x) = x:

     \begin{array}{c|c|c|c|c|c} i & x_{(i)} & i/n & F_0(x_{(i)}) & i/n - x_{(i)} & x_{(i)} - (i-1)/n \\ 1 & 0.12 & 0.167 & 0.12 & 0.047 & 0.120 \\ 2 & 0.35 & 0.333 & 0.35 & -0.017 & 0.183 \\ 3 & 0.47 & 0.500 & 0.47 & 0.030 & 0.137 \\ 4 & 0.62 & 0.667 & 0.62 & 0.047 & 0.120 \\ 5 & 0.81 & 0.833 & 0.81 & 0.023 & 0.143 \\ 6 & 0.93 & 1.000 & 0.93 & 0.070 & 0.097 \end{array}

  5. Find the maximum deviations

     From the last two columns (note that D⁻ is attained at i = 2, not i = 1):

     D^+ = 0.070, \quad D^- = 0.35 - \tfrac{1}{6} \approx 0.183, \quad D_n = \max(D^+, D^-) \approx 0.183

  6. Compare to the critical value

     For n = 6 and α = 0.05, the K-S table gives:

     D_{6,0.05} = 0.521

  7. Conclusion

     Compare the test statistic to the critical value:

     D_n = 0.183 < 0.521 \Rightarrow \text{fail to reject } H_0

Key Insight:

The K-S test measures the maximum vertical distance between the empirical and theoretical CDFs. With Dₙ ≈ 0.183 far below the critical value 0.521, there is no evidence against the Uniform(0,1) hypothesis.
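
The following Python sketch recomputes D⁺, D⁻, and Dₙ for this sample and reproduces the corrected value 0.183 (scipy.stats.kstest would return the same statistic, but this stays with the standard library):

```python
# One-sample K-S statistic against Uniform(0,1), whose CDF is F0(x) = x.
sample = sorted([0.12, 0.35, 0.47, 0.62, 0.81, 0.93])
n = len(sample)

d_plus = max(i / n - x for i, x in enumerate(sample, start=1))          # 0.070
d_minus = max(x - (i - 1) / n for i, x in enumerate(sample, start=1))   # 0.183
d_n = max(d_plus, d_minus)
print(round(d_plus, 3), round(d_minus, 3), round(d_n, 3))  # 0.07 0.183 0.183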

Chi-Square Test for Independence

Problem:

A survey of 200 people examines the relationship between education level and voting preference. Test independence at α = 0.05. Data: High School (A:30, B:40), Bachelor's (A:50, B:35), Graduate (A:20, B:25).

Solution:

  1. State hypotheses

     Test the independence of education level and voting preference:

     H_0: \text{Education and Voting are independent}

  2. Create the contingency table

     Organize the observed frequencies:

     \begin{array}{c|cc|c} & A & B & \text{Total} \\ \text{High School} & 30 & 40 & 70 \\ \text{Bachelor's} & 50 & 35 & 85 \\ \text{Graduate} & 20 & 25 & 45 \\ \text{Total} & 100 & 100 & 200 \end{array}

  3. Calculate expected frequencies

     Under independence, E_{ij} = (\text{Row}_i \times \text{Col}_j) / \text{Total}:

     E_{11} = \frac{70 \times 100}{200} = 35, \quad E_{12} = 35, \quad E_{21} = 42.5, \quad E_{22} = 42.5, \quad E_{31} = 22.5, \quad E_{32} = 22.5

  4. Calculate the chi-square statistic

     Compute each term:

     \chi^2 = \frac{(30-35)^2}{35} + \frac{(40-35)^2}{35} + \frac{(50-42.5)^2}{42.5} + \frac{(35-42.5)^2}{42.5} + \frac{(20-22.5)^2}{22.5} + \frac{(25-22.5)^2}{22.5}

  5. Compute the result

     Sum all terms:

     \chi^2 = 0.714 + 0.714 + 1.324 + 1.324 + 0.278 + 0.278 = 4.632

  6. Determine the degrees of freedom

     For an r×c table:

     df = (r-1)(c-1) = (3-1)(2-1) = 2

  7. Find the critical value and conclude

     From the chi-square table:

     \chi^2_{0.05,2} = 5.99, \quad \chi^2 = 4.632 < 5.99 \Rightarrow \text{fail to reject } H_0

Key Insight:

The chi-square test for independence compares observed cell frequencies to those expected under independence. With χ² = 4.632 < 5.99, we cannot conclude that education level and voting preference are related.
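
A minimal Python sketch of the contingency-table computation above, standard library only:

```python
# Chi-square test of independence on the 3x2 education/voting table.
table = [[30, 40], [50, 35], [20, 25]]   # rows: HS, Bachelor's, Graduate

row_tot = [sum(row) for row in table]
col_tot = [sum(col) for col in zip(*table)]
n = sum(row_tot)

chi2 = sum((table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
           / (row_tot[i] * col_tot[j] / n)
           for i in range(len(table)) for j in range(len(table[0])))
df = (len(table) - 1) * (len(table[0]) - 1)
print(round(chi2, 3), df)                # 4.632, df = 2 -> compare to 5.99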

Run Test for Randomness

Problem:

A sequence of stock price movements shows: + + + - - + + - - - + + + + - - + - +. Test whether the sequence is random at α = 0.05.

Solution:

  1. State hypotheses

     Test the randomness of the sequence:

     H_0: \text{Sequence is random}, \quad H_1: \text{Sequence is not random}

  2. Count runs and symbols

     A run is a maximal block of identical symbols:

     \text{Sequence: } \underbrace{+++}_{1} \underbrace{--}_{2} \underbrace{++}_{3} \underbrace{---}_{4} \underbrace{++++}_{5} \underbrace{--}_{6} \underbrace{+}_{7} \underbrace{-}_{8} \underbrace{+}_{9}

  3. Determine the counts

     Count the total runs and each symbol:

     R = 9, \quad n_1 = 11 \text{ (plus)}, \quad n_2 = 8 \text{ (minus)}, \quad n = 19

  4. Calculate the expected number of runs

     Under randomness:

     E[R] = \frac{2n_1n_2}{n_1+n_2} + 1 = \frac{2(11)(8)}{19} + 1 = 9.26 + 1 = 10.26

  5. Calculate the variance

     Variance of R under H₀:

     \text{Var}(R) = \frac{2n_1n_2(2n_1n_2-n_1-n_2)}{(n_1+n_2)^2(n_1+n_2-1)} = \frac{2(11)(8)(176-19)}{361 \times 18} = \frac{27632}{6498} \approx 4.25

  6. Compute the Z-statistic

     Standardize:

     Z = \frac{R - E[R]}{\sqrt{\text{Var}(R)}} = \frac{9 - 10.26}{\sqrt{4.25}} = \frac{-1.26}{2.06} \approx -0.61

  7. Make a decision

     For a two-tailed test at α = 0.05:

     |Z| = 0.61 < 1.96 \Rightarrow \text{fail to reject } H_0

Key Insight:

The Run Test detects departures from randomness. Too few runs suggest clustering (positive autocorrelation), too many suggest alternation (negative autocorrelation). With R = 9 close to the expected 10.26, the sequence appears random.
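
A minimal Python sketch of the run count and the normal approximation, matching the corrected variance in step 5:

```python
# Run test for randomness on the stock-movement sequence.
from math import sqrt

seq = "+++--++---++++--+-+"              # the 19 symbols from the example
runs = 1 + sum(seq[i] != seq[i - 1] for i in range(1, len(seq)))  # R = 9
n1, n2 = seq.count("+"), seq.count("-")  # 11, 8

e_r = 2 * n1 * n2 / (n1 + n2) + 1
var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
         / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
z = (runs - e_r) / sqrt(var_r)
print(runs, round(e_r, 2), round(var_r, 2), round(z, 2))  # 9 10.26 4.25 -0.61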

Core Theorem Proofs

Mathematical derivations of key nonparametric statistics

Asymptotic Distribution of Sign Test Statistic
Basis for Large Sample Sign Test

For large n, the sign test statistic N⁺ converges in distribution to a Normal distribution.

Theorem Statement

\frac{N^+ - n/2}{\sqrt{n}/2} \xrightarrow{d} N(0,1)

This allows us to use the standard normal table for hypothesis testing when n > 20.

Proof Steps

1
Define Indicator Variables

Let Iᵢ = 1 if Xᵢ > M₀ and 0 otherwise. Under H₀: M = M₀, P(Xᵢ > M₀) = 0.5.

I_i \sim \text{Bernoulli}(0.5)
2
Sum of Indicators

The test statistic N⁺ is the sum of these i.i.d. Bernoulli trials.

N^+ = \sum_{i=1}^n I_i \sim \text{Binomial}(n, 0.5)
3
Calculate Moments

The mean and variance of N⁺ under H₀ are:

E[N^+] = np = 0.5n, \quad \text{Var}(N^+) = np(1-p) = 0.25n
4
Apply Central Limit Theorem

Since Iᵢ are i.i.d. with finite variance, the standardized sum converges to standard normal.

Z = \frac{N^+ - E[N^+]}{\sqrt{\text{Var}(N^+)}} = \frac{N^+ - 0.5n}{0.5\sqrt{n}} \xrightarrow{d} N(0,1)
5
Continuity Correction

For better approximation, we often apply a continuity correction of 0.5.

Z_{\text{corrected}} = \frac{|N^+ - 0.5n| - 0.5}{0.5\sqrt{n}}
6
Conclusion

Thus, for large n, we can reject H₀ if |Z| > z_{α/2}.

P(|Z| > z_{\alpha/2}) \approx \alpha

Example Application

For n = 100 and N⁺ = 60: Z = (60 − 50)/5 = 2.0. Since 2.0 > 1.96, reject H₀ at α = 0.05.
Moments of Wilcoxon Rank-Sum Statistic
Foundation for Normal Approximation

Derivation of the mean and variance of the Rank-Sum statistic W under the null hypothesis.

Theorem Statement

E[W] = \frac{n(m+n+1)}{2}, \quad \text{Var}(W) = \frac{mn(m+n+1)}{12}

These moments are crucial for constructing the Z-statistic for large samples.

Proof Steps

1
Define Rank Sum

Let R₁, …, R_{m+n} be the ranks of the combined sample. Under H₀, any subset of size n is equally likely to be the ranks of the second sample.

W = \sum_{j=1}^n R_j \text{ (sum of } n \text{ randomly chosen ranks)}
2
Expectation of a Single Rank

The average rank in a set of size N=m+n is (1+N)/2.

E[R_j] = \frac{1}{N} \sum_{k=1}^N k = \frac{N(N+1)}{2N} = \frac{N+1}{2}
3
Expectation of W

By linearity of expectation:

E[W] = \sum_{j=1}^n E[R_j] = n \cdot \frac{m+n+1}{2}
4
Variance of W

Since ranks are sampled without replacement, they are not independent. Var(W) = ΣVar(Rⱼ) + Σⱼ≠ₖCov(Rⱼ,Rₖ).

\text{Var}(R_j) = \frac{N^2-1}{12}, \quad \text{Cov}(R_j, R_k) = -\frac{N+1}{12}
5
Algebraic Simplification

Summing the variance and covariance terms (detailed algebra omitted for brevity) yields:

\text{Var}(W) = \frac{mn(m+n+1)}{12}
6
Normal Approximation

As m, n → ∞, the distribution of W approaches normality.

Z = \frac{W - E[W]}{\sqrt{\text{Var}(W)}} \sim N(0,1)

Example Application

For m = 10 and n = 10: E[W] = 10(21)/2 = 105 and Var(W) = 100(21)/12 = 175.
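
These moments can also be verified by brute-force enumeration for small samples, since under H₀ every subset of n ranks is equally likely. A minimal Python sketch with m = n = 3 (sizes chosen here purely for illustration):

```python
# Verify E[W] and Var(W) by enumerating all C(N, n) rank subsets.
from itertools import combinations

m, n = 3, 3
N = m + n
sums = [sum(c) for c in combinations(range(1, N + 1), n)]   # all 20 subsets

mean = sum(sums) / len(sums)                          # n(N+1)/2   = 10.5
var = sum((s - mean) ** 2 for s in sums) / len(sums)  # mn(N+1)/12 = 5.25
print(mean, var)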
Distribution of Wilcoxon Signed Rank Statistic
Foundation for Paired Sample Inference

Under H₀ (symmetric distribution around 0), the signed rank statistic W⁺ has a known distribution based on random sign assignments.

Theorem Statement

W^+ = \sum_{i=1}^n R_i \cdot I(Z_i > 0), \quad E[W^+] = \frac{n(n+1)}{4}, \quad \text{Var}(W^+) = \frac{n(n+1)(2n+1)}{24}

This enables both exact small-sample tests and large-sample normal approximations.

Proof Steps

1
Setup Under Null Hypothesis

Under H₀: the distribution is symmetric about 0. Thus, for each |Zᵢ|, P(Zᵢ > 0) = P(Zᵢ < 0) = 0.5.

\text{Signs } S_i = I(Z_i > 0) \text{ are i.i.d. Bernoulli}(0.5)
2
Express W⁺ as Random Sum

Let Rᵢ be the rank of |Zᵢ|. The signed rank statistic is:

W^+ = \sum_{i=1}^n R_i \cdot S_i = \sum_{i=1}^n R_i \cdot I(Z_i > 0)
3
Derive Expected Value

Since E[Sᵢ] = 0.5 and ranks are fixed (1, 2, ..., n):

E[W^+] = \sum_{i=1}^n R_i \cdot E[S_i] = 0.5 \sum_{i=1}^n i = 0.5 \cdot \frac{n(n+1)}{2} = \frac{n(n+1)}{4}
4
Derive Variance

Since signs are independent and Var(Sᵢ) = 0.25:

\text{Var}(W^+) = \sum_{i=1}^n R_i^2 \cdot \text{Var}(S_i) = 0.25 \sum_{i=1}^n i^2 = 0.25 \cdot \frac{n(n+1)(2n+1)}{6} = \frac{n(n+1)(2n+1)}{24}
5
Exact Distribution

For small n, the exact distribution can be enumerated. There are 2ⁿ equally likely sign patterns:

P(W^+ = w) = \frac{\text{number of sign patterns giving sum } w}{2^n}
6
Normal Approximation

For large n (typically n ≥ 20), the CLT applies:

Z = \frac{W^+ - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \xrightarrow{d} N(0,1)

Example Application

For n = 8: E[W⁺] = 8(9)/4 = 18, Var(W⁺) = 8(9)(17)/24 = 51, SD ≈ 7.14.
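
As a sanity check on the moment formulas, the exact distribution from step 5 can be enumerated by brute force for small n. A minimal Python sketch over the 2⁸ = 256 equally likely sign patterns for n = 8:

```python
# Enumerate all sign patterns to verify E[W+] and Var(W+) for n = 8.
from itertools import product

n = 8
values = [sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
          for signs in product([0, 1], repeat=n)]

mean = sum(values) / 2**n                           # 18.0 = n(n+1)/4
var = sum((v - mean) ** 2 for v in values) / 2**n   # 51.0 = n(n+1)(2n+1)/24
print(mean, var)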
Chi-Square Asymptotic Distribution
Basis for Goodness-of-Fit and Independence Tests

Pearson's chi-square statistic converges to a chi-square distribution under the null hypothesis.

Theorem Statement

\chi^2 = \sum_{i=1}^k \frac{(O_i - E_i)^2}{E_i} \xrightarrow{d} \chi^2_{k-m-1}

This provides the theoretical foundation for chi-square tests in categorical data analysis.

Proof Steps

1
Multinomial Setup

Let (O₁, ..., Oₖ) follow Multinomial(n; p₁, ..., pₖ) where Eᵢ = npᵢ.

(O_1, \ldots, O_k) \sim \text{Multinomial}(n; p_1, \ldots, p_k)
2
Asymptotic Normality

By the multivariate CLT, the standardized frequencies are asymptotically normal:

\frac{O_i - np_i}{\sqrt{np_i}} \xrightarrow{d} N(0, 1 - p_i)
3
Pearson's Statistic as Quadratic Form

The chi-square statistic can be written as:

\chi^2 = \sum_{i=1}^k \frac{(O_i - E_i)^2}{E_i} = n \sum_{i=1}^k \frac{(\hat{p}_i - p_i)^2}{p_i}
4
Covariance Structure

The multinomial covariance creates dependence. The quadratic form becomes:

\chi^2 = \mathbf{Z}^T \Sigma^{-1} \mathbf{Z} \text{ where } \mathbf{Z} \sim N(\mathbf{0}, \Sigma)
5
Degrees of Freedom Reduction

The constraint Σpᵢ = 1 reduces effective dimensions by 1. If m parameters are estimated, subtract m more:

df = k - 1 - m = k - m - 1
6
Limiting Distribution

By properties of quadratic forms of normal vectors:

\chi^2 \xrightarrow{d} \chi^2_{k-m-1} \text{ as } n \to \infty

Example Application

For k = 6 categories with no estimated parameters: df = 6 − 1 = 5, and the critical value at α = 0.05 is \chi^2_{0.05,5} = 11.07.
Kolmogorov-Smirnov Limiting Distribution
Exact Distribution-Free Testing

The scaled K-S statistic √n·Dₙ converges to the Kolmogorov distribution.

Theorem Statement

\sqrt{n} D_n = \sqrt{n} \sup_x |F_n(x) - F_0(x)| \xrightarrow{d} K

The Kolmogorov distribution K has CDF: P(K ≤ x) = 1 - 2Σ(-1)^(j-1)e^(-2j²x²).

Proof Steps

1
Empirical Process Definition

Define the empirical process:

\alpha_n(x) = \sqrt{n}\,[F_n(x) - F_0(x)]
2
Glivenko-Cantelli Theorem

First, we need the uniform law of large numbers:

\sup_x |F_n(x) - F(x)| \xrightarrow{a.s.} 0 \text{ as } n \to \infty
3
Donsker's Theorem

The empirical process converges to a Brownian bridge B(t) on [0,1]:

\alpha_n(F^{-1}(t)) \xrightarrow{d} B(t) \text{ in } D[0,1]
4
Continuous Mapping

Apply the continuous mapping theorem to the supremum functional:

\sqrt{n} D_n = \sup_t |\alpha_n(F^{-1}(t))| \xrightarrow{d} \sup_{0 \leq t \leq 1} |B(t)|
5
Brownian Bridge Supremum

The distribution of sup|B(t)| is the Kolmogorov distribution:

P\left(\sup_{0 \leq t \leq 1} |B(t)| \leq x\right) = 1 - 2\sum_{j=1}^{\infty} (-1)^{j-1} e^{-2j^2 x^2}
6
Critical Values

The Kolmogorov distribution provides asymptotic critical values:

K_{0.05} \approx 1.36, \quad K_{0.01} \approx 1.63 \quad \text{(critical values for } \sqrt{n}D_n\text{)}

Example Application

For n = 100 and Dₙ = 0.12: √n·Dₙ = 10 × 0.12 = 1.2 < 1.36, so fail to reject H₀ at α = 0.05.
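
A minimal Python sketch that evaluates the Kolmogorov series numerically, confirming K₀.₀₅ ≈ 1.36 and giving an approximate p-value for this example:

```python
# Evaluate P(K <= x) = 1 - 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 x^2).
from math import exp

def kolmogorov_cdf(x, terms=100):
    return 1 - 2 * sum((-1) ** (j - 1) * exp(-2 * j**2 * x**2)
                       for j in range(1, terms + 1))

print(round(kolmogorov_cdf(1.36), 4))  # ~0.9505, so K_0.05 ~ 1.36
print(round(kolmogorov_cdf(1.2), 4))   # ~0.8878 -> p-value ~0.11, consistent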

Practice Quiz

Test your understanding with 10 multiple-choice questions

1. What is the primary advantage of nonparametric tests over parametric tests?
2. In the Sign Test for a median, if we have 15 observations with 11 positive differences, 3 negative differences, and 1 zero (which we discard), what is the test statistic N⁺?
3. For the Wilcoxon Rank Sum Test comparing two groups with m = 6 and n = 8, what is the expected value of the rank sum W under the null hypothesis?
4. When should you use the Wilcoxon Signed Rank Test instead of the Sign Test?
5. In a Chi-Square Goodness-of-Fit Test with k = 5 categories and no estimated parameters, the degrees of freedom is:
6. The Kolmogorov-Smirnov test statistic Dₙ measures:
7. For a Chi-Square Independence Test with a 3×4 contingency table, the degrees of freedom is:
8. In the Run Test, if we observe too few runs compared to the expected value, this suggests:
9. Which assumption is required for the Wilcoxon Rank Sum Test but NOT for the Sign Test?
10. For the Chi-Square Goodness-of-Fit Test, the rule of thumb is that the expected frequency in each cell should be at least:

Frequently Asked Questions

Common questions about nonparametric hypothesis testing

What is Nonparametric Testing and why use it?
Nonparametric tests are statistical testing methods that do not depend on a specific distributional form (such as the normal distribution). When data do not meet parametric test assumptions (normality, homogeneity of variance), or when data are ordinal rather than interval-scaled, nonparametric tests are often the only appropriate choice. Although their power is usually slightly lower than that of parametric tests, they are more robust.
Key Point: Distribution-free & Robust
What's the difference between the Sign Test and the Wilcoxon Rank-Sum Test?
The Sign Test only uses the "sign" information of data (greater or less than median), discarding specific numerical magnitudes, thus having lower efficiency. The Rank-Sum Test uses the "ranking" information of data, preserving more information, making it more efficient (higher power) in most cases than the Sign Test.
Comparison: Sign Test: only signs; Rank-Sum: rankings
What are typical null hypotheses in nonparametric tests?
Null hypotheses in nonparametric tests typically concern distribution shape or location. For example, the two-sample rank-sum test's null hypothesis is "the two population distributions are completely identical". Rejecting the null hypothesis indicates significant differences in location (median) or shape between distributions.
H_0: F_1(x) = F_2(x)
Why are nonparametric tests insensitive to outliers?
Because nonparametric tests are usually based on ranks rather than original values. An extreme outlier (like 1000) after ranking might just be "rank n", and its specific numerical magnitude doesn't affect the test statistic. This makes nonparametric tests very reliable for data with outliers.
Example: Data 1, 2, 3, 1000 → Ranks 1, 2, 3, 4
What are the limitations of Chi-Square Goodness-of-Fit Test?
The Chi-Square test requires sufficiently large sample size, typically requiring expected frequency of at least 5 in each cell. If expected frequencies are too small, chi-square approximation fails. Additionally, the test is sensitive to data binning methods - different groupings may lead to different conclusions.
Key Point: Expected frequency >= 5
What advantages does Kolmogorov-Smirnov (K-S) test have over Chi-Square?
The K-S test is directly based on the empirical distribution function (EDF) and doesn't require data binning, thus avoiding information loss and subjective grouping decisions. It's particularly effective for continuous distributions and very sensitive to shape differences. However, it's generally only applicable to continuous data.
Comparison: K-S: no binning needed, for continuous data
Is nonparametric test power always lower than parametric tests?
Usually, yes. If the data truly follow a normal distribution, the t-test is most efficient, and nonparametric tests lose some information (the Wilcoxon test's asymptotic relative efficiency versus the t-test is about 95%, a roughly 5% loss). However, if the data are heavily skewed or heavy-tailed, nonparametric tests can actually have higher power than parametric tests.
Key Point: Efficiency trade-off
When should we use the Run Test?
The Run Test is mainly used to detect data randomness. If you suspect there's some trend, periodicity, or autocorrelation in the data collection process (e.g., positive and negative values alternate too frequently or too rarely), the Run Test can help determine if the sample was randomly drawn.
Example: Sequence + + - - + + - - (suspiciously regular alternation, not random)