Represent paired data, identify positive, negative, or no association, and model relationships with linear functions for prediction.
Study time $x$ (hours): 1, 2, 3, 4; scores $y$: 45, 50, 55, 60. Find a linear model and predict the score at a given study time $x = x_0$.
Slope $m = \frac{50 - 45}{2 - 1} = 5$, intercept $b = 45 - 5(1) = 40$ → model: $\hat{y} = 5x + 40$
Prediction at $x = x_0$ → $\hat{y} = 5x_0 + 40$
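The fit and prediction steps above can be sketched in a few lines of Python. This is a minimal illustration, not a general fitting routine: it takes the slope from the first two data points, which works only because these data are exactly linear. The helper names `line_through` and `predict`, and the input `x_new = 5`, are assumptions chosen for illustration and do not come from the example.

```python
# Study-time data from the example above.
hours = [1, 2, 3, 4]
scores = [45, 50, 55, 60]


def line_through(p, q):
    """Return (slope, intercept) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# Use the first two points; valid here because the data are exactly linear.
slope, intercept = line_through((hours[0], scores[0]), (hours[1], scores[1]))


def predict(x):
    """Predicted score for x hours of study under the fitted line."""
    return slope * x + intercept


# Check that every data point lies on the line.
on_line = all(predict(x) == y for x, y in zip(hours, scores))

x_new = 5  # hypothetical input, not taken from the original example
print(slope, intercept, on_line, predict(x_new))
```

For data that do not fall exactly on a line, two points are not enough and a least squares fit would be the usual choice.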
Residual of a point $(x_i, y_i)$: $e_i = y_i - \hat{y}_i$. Positive residual means the model underpredicts.
Least squares line minimizes $\sum_i e_i^2 = \sum_i (y_i - \hat{y}_i)^2$. Its slope/intercept can be computed (in advanced courses) from summary stats.
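As a rough sketch of how the least squares slope and intercept follow from summary statistics, the code below uses the standard formulas $m = S_{xy} / S_{xx}$ and $b = \bar{y} - m\bar{x}$. The function names `least_squares` and `sse` are illustrative assumptions. The three points are the ones used in the residual example in these notes, and the comparison line $\hat{y} = 5x + 40$ is assumed to be the model fitted to the study-time data.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def sse(xs, ys, slope, intercept):
    """Sum of squared residuals for the line y = slope * x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3]
ys = [45, 50, 60]

m, b = least_squares(xs, ys)
sse_ls = sse(xs, ys, m, b)         # least squares line
sse_other = sse(xs, ys, 5, 40)     # assumed comparison line y = 5x + 40

print(round(m, 3), round(b, 3))
print(round(sse_ls, 3), sse_other)
```

On these points the least squares line has a smaller sum of squared residuals than the comparison line, which is what "minimizes" means here.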
Residual plot: random scatter around 0 suggests a linear model is appropriate; patterns suggest nonlinearity or heteroscedasticity.
Correlation coefficient $r$ measures linear association strength ($-1 \le r \le 1$). Sign gives direction; $|r|$ close to 1 indicates strong linear association.
Coefficient of determination $R^2$: fraction of variability in $y$ explained by the model; $R^2 = r^2$ for simple linear regression.
Contextual interpretation: “About $R^2 \times 100\%$ of the variation in $y$ can be explained by $x$ via the linear model.”
Using $\hat{y} = 5x + 40$, compute residuals for points (1,45), (2,50), (3,60).
(1,45): $e = 45 - 45 = 0$
(2,50): $e = 50 - 50 = 0$
(3,60): $e = 60 - 55 = 5$ (model underpredicts)
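The same residual calculation can be written as a short script. This is a sketch that assumes the model is $\hat{y} = 5x + 40$, the line fitted to the study-time data; the function name `model` and the labels are illustrative.

```python
def model(x):
    """Assumed linear model: predicted score for x hours of study."""
    return 5 * x + 40


points = [(1, 45), (2, 50), (3, 60)]

# Residual = observed y minus predicted y.
residuals = [y - model(x) for x, y in points]

for (x, y), e in zip(points, residuals):
    if e > 0:
        note = "model underpredicts"
    elif e < 0:
        note = "model overpredicts"
    else:
        note = "exact"
    print(f"({x},{y}): residual = {e} ({note})")
```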
If $r = 0.9$, then $R^2 = r^2 = 0.81$: about 81% of score variation is explained by study time.
1) For points (0,2), (2,6), (4,10), find a linear model.
2) Given model , compute residual for (x,y)=(4,11).
3) Explain why extrapolating from x∈[1,4] to x=20 can be risky.