MathIsimple
Lesson 4.1: Describing & Analyzing Single-Variable Data

See the Story Behind the Numbers

Use visual displays and summary statistics to understand center, spread, and shape. Identify clusters, gaps, and outliers.

Learning Objectives

  • Create and interpret dot plots, histograms, and box plots
  • Compute mean, median, range, IQR, and standard deviation
  • Describe distribution shape (symmetric, right/left skewed)
  • Identify clusters and outliers

Visualizations

Dot Plot

Marks each data value; great for small datasets.

Histogram

Bins data into intervals; shows shape and spread.

Box Plot

Displays Q1, the median, and Q3 as a box (its width is the IQR), with whiskers and any potential outliers marked.
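
As a rough illustration of how a histogram bins data, the sketch below counts values per interval in plain Python and prints a text histogram. The helper name `bin_counts`, the starting edge 75, and the bin width 5 are choices made for this example only; the data are the ten scores used in the worked example that follows.

```python
from collections import Counter

def bin_counts(values, start, width):
    """Count values per bin [start + i*width, start + (i+1)*width).

    Assumes every value is >= start (an assumption of this sketch).
    """
    counts = Counter((v - start) // width for v in values)
    # Include empty bins between the first and last occupied bin.
    return {start + i * width: counts[i] for i in range(max(counts) + 1)}

scores = [85, 78, 92, 88, 90, 75, 82, 86, 95, 80]
width = 5
for left, count in bin_counts(scores, start=75, width=width).items():
    # One '#' per data value that falls in the bin.
    print(f"[{left}, {left + width}): {'#' * count}")
```

Changing `width` changes the apparent shape, so it is worth trying a few bin widths before describing a distribution.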

Worked Example (Box Plot)

Scores: 85, 78, 92, 88, 90, 75, 82, 86, 95, 80. Find median and IQR.

Sorted: 75, 78, 80, 82, 85, 86, 88, 90, 92, 95

Median = $(85+86)/2 = 85.5$

Q1 = 80 (median of the lower half 75, 78, 80, 82, 85), Q3 = 90 (median of the upper half 86, 88, 90, 92, 95) → IQR = $Q_3 - Q_1 = 10$
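
The same calculation can be sketched in Python. The function name `quartiles` is hypothetical, and the code uses the median-of-halves method from this lesson. Library routines often use other quartile conventions and may return slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method.

    When n is odd, the overall median is left out of both halves.
    Assumes at least two data values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)

scores = [85, 78, 92, 88, 90, 75, 82, 86, 95, 80]
q1, med, q3 = quartiles(scores)
print(q1, med, q3, q3 - q1)  # Q1 = 80, median = 85.5, Q3 = 90, IQR = 10
```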

Quartiles, Outliers, and Robustness

Quartiles: Q1 is the median of the lower half of the sorted data and Q3 is the median of the upper half (exclude the overall median from both halves when n is odd).

Outlier rule (Tukey): values outside $[Q_1 - 1.5\,\mathrm{IQR},\ Q_3 + 1.5\,\mathrm{IQR}]$ are potential outliers.
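
A minimal sketch of the fence rule, assuming the quartiles have already been found. The function names `tukey_fences` and `potential_outliers` are made up for this example, and the quartiles 80 and 90 are those of the ten scores in the box plot worked example.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def potential_outliers(values, q1, q3, k=1.5):
    """Return the values that fall strictly outside the fences."""
    low, high = tukey_fences(q1, q3, k)
    return [x for x in values if x < low or x > high]

scores = [85, 78, 92, 88, 90, 75, 82, 86, 95, 80]
print(tukey_fences(80, 90))                # (65.0, 105.0)
print(potential_outliers(scores, 80, 90))  # [] : no score is flagged
```

A flagged value is only a candidate for closer inspection; the rule does not say the value is wrong.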

Robust statistics: median and IQR are resistant to extreme values; mean and standard deviation are not.
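
To see the difference in practice, the sketch below compares the mean and median of a small dataset with and without one extreme value. The data are the values used in Advanced Worked Example A; the variable names are arbitrary.

```python
from statistics import mean, median

base = [4, 5, 6, 7, 8, 12, 13, 14]
with_extreme = base + [30]  # add a single extreme value

print(mean(base), median(base))                  # mean 8.625, median 7.5
print(mean(with_extreme), median(with_extreme))  # mean 11, median 8
```

One added value moves the mean by more than 2 units but the median by only 0.5.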

Standard Deviation and z-score

Sample SD: $s=\sqrt{\dfrac{\sum (x_i-\bar{x})^2}{n-1}}$; Population SD: $\sigma=\sqrt{\dfrac{\sum (x_i-\mu)^2}{n}}$.
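
The two formulas differ only in the divisor. The sketch below writes them out by hand and compares them with `statistics.stdev` and `statistics.pstdev` from the Python standard library. The helper names `sample_sd` and `population_sd` are hypothetical, and the data are the ten scores from the box plot worked example.

```python
import math
import statistics

def sum_sq_dev(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def sample_sd(values):
    """Sample SD: divide by n - 1."""
    return math.sqrt(sum_sq_dev(values) / (len(values) - 1))

def population_sd(values):
    """Population SD: divide by n."""
    return math.sqrt(sum_sq_dev(values) / len(values))

scores = [85, 78, 92, 88, 90, 75, 82, 86, 95, 80]
print(sample_sd(scores), statistics.stdev(scores))       # the two should agree
print(population_sd(scores), statistics.pstdev(scores))  # the two should agree
```

For the same data the sample SD is always slightly larger than the population SD, because it divides by the smaller number.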

z-score: $z=\dfrac{x-\bar{x}}{s}$ measures how many SDs a point is from the mean.

Chebyshev (any distribution): the proportion within k SDs of the mean is at least $1-\dfrac{1}{k^2}$ for $k>1$.

Advanced Worked Examples

A) Outlier Detection with IQR

Data (sorted): 4, 5, 6, 7, 8, 12, 13, 14, 30

Median = 8 (n = 9 is odd, so exclude it from both halves). Lower half: 4, 5, 6, 7 → Q1 = 5.5; upper half: 12, 13, 14, 30 → Q3 = 13.5 → IQR = 8

Fence: $[5.5-1.5\cdot 8,\ 13.5+1.5\cdot 8]=[-6.5,\ 25.5]$

30 is outside → potential outlier

B) z-score Interpretation

Class scores: $\bar{x}=78$, $s=6$. Alice scored 90, Bob scored 68. Compare relative standing.

Alice: $z=\dfrac{90-78}{6}=2$ (2 SDs above the mean)

Bob: $z=\dfrac{68-78}{6}\approx-1.67$ (about 1.67 SDs below the mean)
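
The comparison can be checked with a few lines of Python. The function name `z_score` is hypothetical; the mean, SD, and scores are the ones given in this example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

alice = z_score(90, mean=78, sd=6)
bob = z_score(68, mean=78, sd=6)
print(alice)          # 2.0
print(round(bob, 2))  # -1.67
```

Alice's score is farther from the class mean (in SD units) than Bob's, and in the opposite direction.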

C) Chebyshev Guarantee

At least what proportion of data lie within 3 SDs of the mean?

$1-\dfrac{1}{3^2}=\tfrac{8}{9}\approx 0.8889$
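
As a sanity check, the sketch below computes the Chebyshev lower bound and compares it with the actual proportion of a dataset lying within k SDs of its mean. The function names are hypothetical, the data are those of Example A, and the population SD is used so that the guarantee applies exactly to the dataset itself.

```python
import statistics

def chebyshev_bound(k):
    """Minimum proportion of data within k SDs of the mean (requires k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2

def fraction_within(values, k):
    """Observed proportion of values within k population SDs of the mean."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(x - m) <= k * sd for x in values) / len(values)

data = [4, 5, 6, 7, 8, 12, 13, 14, 30]
print(chebyshev_bound(3))                            # about 0.8889
print(fraction_within(data, 2), chebyshev_bound(2))  # observed vs. guaranteed minimum
```

The observed proportion is usually well above the bound, since Chebyshev's guarantee has to hold for every possible distribution.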

Practice

1) Compute mean, median, IQR, and SD for [2, 4, 6, 8].

Solution
Already sorted; median 5; Q1 = 3, Q3 = 7 → IQR = 4; $\bar{x}=5$; sample SD $s=\sqrt{20/3}\approx 2.58$ (population SD $\sigma=\sqrt{5}\approx 2.24$)

2) For Q1=12, Q3=20, apply IQR rule. Is 35 an outlier?

Solution
IQR = 8, so 1.5 × IQR = 12; fences [12 - 12, 20 + 12] = [0, 32]; 35 > 32 → potential outlier

3) A value has z=2.5 (using class mean & SD). Interpret in context.

Sample Answer
The value is 2.5 SDs above the class mean, which is unusually high relative to the rest of the class.