MathIsimple

Probability Theory – Problem 5: Find

Question

Consider a biased coin that lands heads with probability pp and tails with probability 1p1-p, where the outcomes are recorded as ``HH'' and ``TT'' respectively. The coin is tossed repeatedly. Let NHHN_{HH} denote the number of tosses required for the pattern ``HHHH'' to appear for the first time. Find E[NHH],Var(NHH)\mathbb{E}[N_{HH}],\mathrm{Var} (N_{HH}). (Hint: There are two approaches. One is to compute conditional expectations and then apply the law of total expectation to calculate E[NHH]\mathbb{E}[N_{HH}] and E[NHH2]\mathbb{E}[N_{HH}^2] separately, thereby obtaining the result. The other approach is to show that NHN_{H} and NHHHN_{HH|H} are independent, and then construct the generating function of NHHHNHN_{HH|H} - N_{H} to solve the problem.)

Step-by-step solution

Let NHHN_{HH} denote the number of tosses until two consecutive heads (HH) first appear. The coin lands heads with probability pp and tails with probability 1p1-p. Let E=E[NHH]E = E[N_{HH}]. Consider the outcome of the first toss. If the first toss is tails (T), which occurs with probability 1p1-p, then one toss has been used but no progress has been made toward the goal "HH"; the process effectively restarts from scratch. Hence the expected total number of tosses in this case is 1+E1+E. If the first toss is heads (H), which occurs with probability pp, then one toss has been used and one step of progress has been made. We now consider the outcome of the second toss. If the second toss is heads (H), which occurs with probability pp, then "HH" has appeared and the process terminates. The total number of tosses is 22. If the second toss is tails (T), which occurs with probability 1p1-p, then the pattern "HT" has occurred and the process has not terminated. Two tosses have been used, but the accumulated "H" has been broken, so the process restarts from scratch. The expected total number of tosses in this case is 2+E2+E. By the law of total expectation, EE equals the weighted average of the expected values over all cases: E=(1p)(1+E)+p[p2+(1p)(2+E)]E = (1-p) \cdot (1+E) + p \cdot [p \cdot 2 + (1-p) \cdot (2+E)] Solving for EE: E=1p+(1p)E+2p2+(2p2p2)+(pp2)EE = 1 - p + (1-p)E + 2p^2 + (2p-2p^2) + (p-p^2)E E=1+pp2E = \frac{1+p}{p^2} Therefore, E[NHH]=1+pp2E[N_{HH}] = \frac{1+p}{p^2}. Let S=E[NHH2]S = E[N_{HH}^2]. Using the law of total expectation, we decompose NHH2N_{HH}^2 as follows: If the first toss is tails (T), the total number of tosses is 1+N1+N', where NN' is the number of additional tosses from a fresh start until "HH" appears, having the same distribution as NHHN_{HH}. Taking the expectation of the square gives E[(1+N)2]=E[1+2N+(N)2]=1+2E[N]+E[(N)2]=1+2E+SE[(1+N')^2] = E[1+2N'+(N')^2] = 1+2E[N'] + E[(N')^2] = 1+2E+S. If the first toss is heads (H), we consider the second toss: If the second toss is heads (H), the total number of tosses is 22, and its square is 44. If the second toss is tails (T), the total number of tosses is 2+N2+N'', where NN'' has the same distribution as NHHN_{HH}. Taking the expectation of the square gives E[(2+N)2]=E[4+4N+(N)2]=4+4E[N]+E[(N)2]=4+4E+SE[(2+N'')^2] = E[4+4N''+(N'')^2] = 4+4E[N''] + E[(N'')^2] = 4+4E+S. Combining the above cases yields the equation for SS: S=(1p)(1+2E+S)+p[p4+(1p)(4+4E+S)]S = (1-p) \cdot (1+2E+S) + p \cdot [p \cdot 4 + (1-p) \cdot (4+4E+S)] Expanding and collecting terms in SS: S=(1p)(1+2E)+(1p)S+4p2+p(1p)(4+4E)+p(1p)SS = (1-p)(1+2E) + (1-p)S + 4p^2 + p(1-p)(4+4E) + p(1-p)S S(1p2)S=1+3p+(2+2p4p2)ES - (1-p^2)S = 1+3p + (2+2p-4p^2)E p2S=1+3p+2(1+p2p2)Ep^2 S = 1+3p + 2(1+p-2p^2)E Substituting the previously obtained E=1+pp2E = \frac{1+p}{p^2}: p2S=1+3p+2(1+p2p2)1+pp2p^2 S = 1+3p + 2(1+p-2p^2)\frac{1+p}{p^2} Multiplying both sides by p2p^2 to clear the denominator: p4S=p2(1+3p)+2(1+p2p2)(1+p)p^4 S = p^2(1+3p) + 2(1+p-2p^2)(1+p) p4S=p3p2+4p+2p^4 S = -p^3 - p^2 + 4p + 2 Therefore, E[NHH2]=S=p3p2+4p+2p4E[N_{HH}^2] = S = \frac{-p^3 - p^2 + 4p + 2}{p^4}. Computing the variance via Var(NHH)=E[NHH2](E[NHH])2Var(N_{HH}) = E[N_{HH}^2] - (E[N_{HH}])^2: Var(NHH)=p3p2+4p+2p4(1+pp2)2Var(N_{HH}) = \frac{-p^3 - p^2 + 4p + 2}{p^4} - \left(\frac{1+p}{p^2}\right)^2 Var(NHH)=p32p2+2p+1p4Var(N_{HH}) = \frac{-p^3 - 2p^2 + 2p + 1}{p^4}

Final answer

The expectation is E[NHH]=1+pp2E[N_{HH}] = \frac{1+p}{p^2}. The variance is Var(NHH)=1+2p2p2p3p4Var(N_{HH}) = \frac{1+2p-2p^2-p^3}{p^4}.

Marking scheme

Below is the rubric tailored to this probability theory problem. The official solution is taken as fully correct.


1. Checkpoints (max 7 pts total)

Score exactly one chain; take the maximum subtotal among chains. Do not add points across chains.

Chain A: Conditional Expectation / System of Equations (Official Solution Path)

  • Setting up the expectation equation [2 pts] [Additive]: Correctly formulate the recursive equation for E=E[NHH]E = \mathbb{E}[N_{HH}].
  • For example: E=(1p)(1+E)+p[p2+(1p)(2+E)]E = (1-p)(1+E) + p[p \cdot 2 + (1-p)(2+E)], or an equivalent form obtained by expanding and simplifying.
  • *Note: If the equation has the correct structure (reflecting the ideas of resetting and step accumulation) but contains minor coefficient errors, award 1 pt.*
  • Solving for the expectation [1 pt] [Additive]: Correctly obtain E=1+pp2E = \frac{1+p}{p^2}.
  • Setting up the second moment equation [2 pts] [Additive]: Correctly formulate the recursive equation for S=E[NHH2]S = \mathbb{E}[N_{HH}^2].
  • Key requirement: Must correctly expand the expectation of the squared term E[(N+k)2]=E[N2]+2kE[N]+k2\mathbb{E}[(N+k)^2] = \mathbb{E}[N^2] + 2k\mathbb{E}[N] + k^2 (i.e., reflecting S+2kE+k2S + 2kE + k^2).
  • *If the cross term 2kE2kE is omitted (i.e., incorrectly writing E[(N+k)2]=S+k2\mathbb{E}[(N+k)^2] = S + k^2), this item and all subsequent items receive no credit.*
  • Solving for the second moment [1 pt] [Additive]: Substitute EE and correctly obtain S=p3p2+4p+2p4S = \frac{-p^3 - p^2 + 4p + 2}{p^4} (or an equivalent correct intermediate expression).
  • Final variance result [1 pt] [Additive]: Combine using Var=SE2\mathrm{Var} = S - E^2 and simplify algebraically to obtain the final result p32p2+2p+1p4\frac{-p^3 - 2p^2 + 2p + 1}{p^4}.

Chain B: Probability Generating Function (PGF / Generating Functions)

  • Setting up the PGF equation [2 pts] [Additive]: Correctly establish the self-referential equation for the generating function G(z)=E[zN]G(z) = \mathbb{E}[z^N].
  • For example: G(z)=(1p)zG(z)+p(1p)z2G(z)+p2z2G(z) = (1-p)zG(z) + p(1-p)z^2G(z) + p^2z^2.
  • Closed-form PGF expression [1 pt] [Additive]: Correctly solve for the explicit form of G(z)G(z), e.g., G(z)=p2z21(1p)zp(1p)z2G(z) = \frac{p^2z^2}{1 - (1-p)z - p(1-p)z^2}.
  • Expectation computation [1 pt] [Additive]: Correctly compute E[NHH]=1+pp2\mathbb{E}[N_{HH}] = \frac{1+p}{p^2} via G(1)G'(1).
  • Second derivative / second moment computation [2 pts] [Additive]: Correctly compute G(1)G''(1) or E[N(N1)]\mathbb{E}[N(N-1)] from G(z)G(z), showing a correct and complete calculus derivation.
  • Final variance result [1 pt] [Additive]: Combine the above results to correctly compute Var(NHH)=p32p2+2p+1p4\mathrm{Var}(N_{HH}) = \frac{-p^3 - 2p^2 + 2p + 1}{p^4}.

Total (max 7)


2. Zero-credit items

  • Merely stating standard geometric distribution formulas (such as 1/p1/p or (1p)/p2(1-p)/p^2) without any derivation specific to the "HH" pattern.
  • Merely writing down the definition of expectation or variance (such as xP(x)\sum x P(x)) without any concrete substitution or computation for this problem.
  • Stating the answer without proof, and the answer is incorrect.

3. Deductions

  • Arithmetic error [-1]: After setting up the correct equation, a simple algebraic or sign error leads to an incorrect final result (deduct 1 pt per error, but at most 1 pt per Chain).
  • Logic gap [Cap at 3/7]: In Chain A, if the cross term arising from expanding the square is omitted when computing the second moment SS, this constitutes a core logical error; that part and the subsequent variance part receive no credit (score capped at 3 pts for the expectation portion).
  • Unsimplified result [No Penalty]: If the final result is in a correct form (e.g., expressed as a difference of fractions) but not reduced to a single fraction as in the official solution, no deduction is applied, provided the expressions are mathematically equivalent.
Ask AI ✨