Question
For random variables $X$ and $Y$, define
$$d(X, Y) = E\left[\frac{|X-Y|}{1+|X-Y|}\right].$$
(1) Prove that $d$ defines a distance (metric) on the space of random variables (with random variables identified when they are equal almost surely). That is, show that $d$ satisfies positive definiteness, symmetry, and the triangle inequality.
(2) Prove that the random variables $X_n$ converge to $X$ in probability if and only if $d(X_n, X) \to 0$ as $n \to \infty$.
Step-by-step solution
Step 1. We first prove that $d$ satisfies the definition of a metric. (i) Non-negativity and positive definiteness: Since $|X-Y| \ge 0$, the integrand satisfies $\frac{|X-Y|}{1+|X-Y|} \ge 0$, so $d(X,Y) \ge 0$ after taking expectations. If $d(X,Y) = 0$, then the non-negative random variable $\frac{|X-Y|}{1+|X-Y|}$ has expectation zero, which implies that it equals zero almost surely. Consequently $|X-Y| = 0$ almost surely, i.e., $X = Y$ a.s. (ii) Symmetry: Since $|X-Y| = |Y-X|$, we have $d(X,Y) = d(Y,X)$. (iii) Triangle inequality: Consider the function $f(t) = \frac{t}{1+t}$. For $t \ge 0$, its derivative is $f'(t) = \frac{1}{(1+t)^2} > 0$, so $f$ is strictly increasing on $[0, \infty)$. For any real numbers $a, b$, the absolute value inequality gives $|a+b| \le |a| + |b|$. Let $a = X - Z$ and $b = Z - Y$, so that $a + b = X - Y$. Using the monotonicity of $f$ together with the inequality $\frac{s+t}{1+s+t} = \frac{s}{1+s+t} + \frac{t}{1+s+t} \le \frac{s}{1+s} + \frac{t}{1+t}$ for $s, t \ge 0$, we obtain:
$$\frac{|X-Y|}{1+|X-Y|} \le \frac{|X-Z|+|Z-Y|}{1+|X-Z|+|Z-Y|} \le \frac{|X-Z|}{1+|X-Z|} + \frac{|Z-Y|}{1+|Z-Y|}.$$
Taking expectations on both sides yields $d(X,Y) \le d(X,Z) + d(Z,Y)$. In summary, $d$ defines a metric on the space of random variables (where equality is understood in the almost sure sense).
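The metric properties proved above can be illustrated numerically. The following sketch (an illustration, not part of the proof; the distributions of $X$, $Y$, $Z$ are arbitrary choices) estimates $d$ by a sample average of the integrand. Note that symmetry, $d(X,X)=0$, and the triangle inequality all hold pathwise for the integrand, so they hold exactly for the empirical averages as well:

```python
# Numerical illustration of the metric properties of
# d(X, Y) = E[|X - Y| / (1 + |X - Y|)] via Monte Carlo sampling.
import numpy as np

rng = np.random.default_rng(0)

def d_hat(x, y):
    """Sample estimate of d(X, Y): average of |x-y| / (1 + |x-y|)."""
    t = np.abs(x - y)
    return np.mean(t / (1.0 + t))

n = 100_000
X = rng.normal(0.0, 1.0, n)      # illustrative distributions
Y = rng.normal(1.0, 2.0, n)
Z = rng.exponential(1.0, n)

# Positive definiteness: d(X, X) = 0.
assert d_hat(X, X) == 0.0

# Symmetry: |X - Y| = |Y - X|, so the estimates agree exactly.
assert d_hat(X, Y) == d_hat(Y, X)

# Triangle inequality: holds pathwise for the integrands,
# hence also for their sample averages.
assert d_hat(X, Y) <= d_hat(X, Z) + d_hat(Z, Y)
```

Because each property holds sample-by-sample, the assertions hold exactly and not merely up to Monte Carlo error.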
Step 2. Proof of sufficiency: Suppose $d(X_n, X) \to 0$. For any given $\varepsilon > 0$, consider the event $\{|X_n - X| \ge \varepsilon\}$. On this event, by the monotonicity of $f(t) = \frac{t}{1+t}$, we have $\frac{|X_n - X|}{1+|X_n - X|} \ge \frac{\varepsilon}{1+\varepsilon}$. By a generalization of Chebyshev's inequality (or directly from properties of expectation):
$$d(X_n, X) = E\left[\frac{|X_n - X|}{1+|X_n - X|}\right] \ge \frac{\varepsilon}{1+\varepsilon}\, P(|X_n - X| \ge \varepsilon).$$
Rearranging gives $P(|X_n - X| \ge \varepsilon) \le \frac{1+\varepsilon}{\varepsilon}\, d(X_n, X)$. As $n \to \infty$, since $d(X_n, X) \to 0$, it follows that $P(|X_n - X| \ge \varepsilon) \to 0$. Hence $X_n$ converges to $X$ in probability.
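The Chebyshev-type bound used in Step 2 can also be checked on samples. In the sketch below, $X_n = X + $ noise is an illustrative choice; since $f(t) \ge \frac{\varepsilon}{1+\varepsilon}\,\mathbf{1}\{t \ge \varepsilon\}$ holds pathwise, the empirical version of the bound holds exactly:

```python
# Sample check of the bound from Step 2:
#   P(|X_n - X| >= eps) <= (1 + eps)/eps * d(X_n, X).
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
X = rng.normal(size=n)
Xn = X + rng.normal(scale=0.3, size=n)   # X_n = X + noise (illustrative)

t = np.abs(Xn - X)
d_emp = np.mean(t / (1.0 + t))           # empirical d(X_n, X)

for eps in (0.1, 0.5, 1.0):
    p_emp = np.mean(t >= eps)            # empirical P(|X_n - X| >= eps)
    # The pathwise inequality f(t) >= eps/(1+eps) on {t >= eps}
    # makes this hold exactly for the sample averages.
    assert p_emp <= (1.0 + eps) / eps * d_emp
```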
Step 3. Proof of necessity: Suppose $X_n \to X$ in probability, i.e., $P(|X_n - X| \ge \varepsilon) \to 0$ for every $\varepsilon > 0$. For any given $\varepsilon > 0$, we split the expectation into two parts:
$$d(X_n, X) = E\left[\frac{|X_n - X|}{1+|X_n - X|};\ |X_n - X| \ge \varepsilon\right] + E\left[\frac{|X_n - X|}{1+|X_n - X|};\ |X_n - X| < \varepsilon\right].$$
For the first term, the integrand is always bounded above by 1, so the first term is at most $P(|X_n - X| \ge \varepsilon)$. For the second term, when $|X_n - X| < \varepsilon$, we have $\frac{|X_n - X|}{1+|X_n - X|} < \frac{\varepsilon}{1+\varepsilon} \le \varepsilon$, so the second term is at most $\varepsilon$. Thus $d(X_n, X) \le P(|X_n - X| \ge \varepsilon) + \varepsilon$. Since $X_n \to X$ in probability, as $n \to \infty$ we have $\limsup_{n \to \infty} d(X_n, X) \le \varepsilon$. Since $\varepsilon > 0$ was arbitrary, we conclude $d(X_n, X) \to 0$.
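The necessity direction can be observed numerically. In this sketch (an illustration under an assumed model, not part of the proof) we take $X_n = X + Z/n$ with $Z$ standard normal, so $|X_n - X| = |Z|/n \to 0$, a fortiori in probability, and the empirical $d(X_n, X)$ shrinks toward 0:

```python
# Illustration of Step 3: if X_n -> X in probability, then d(X_n, X) -> 0.
# Here X_n = X + Z/n (an assumed toy model), so |X_n - X| = |Z|/n.
import numpy as np

rng = np.random.default_rng(2)
m = 100_000
Z = rng.normal(size=m)

def d_emp(n):
    """Empirical d(X_n, X) for X_n = X + Z/n."""
    t = np.abs(Z) / n                    # |X_n - X|
    return np.mean(t / (1.0 + t))

ds = [d_emp(n) for n in (1, 10, 100, 1000)]

# d(X_n, X) decreases along this sequence and becomes small,
# consistent with d(X_n, X) -> 0.
assert all(a > b for a, b in zip(ds, ds[1:]))
assert ds[-1] < 0.01
```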
Final answer
QED.
Marking scheme
The following is the scoring rubric based on the official solution (maximum 7 points).
I. Checkpoints (Total max 7)
Part 1: Proving that $d$ is a metric (max 2 pts)
- Positive definiteness and symmetry [additive]
- State that $d(X,Y) \ge 0$ and $d(X,Y) = 0 \iff X = Y$ a.s. (almost sure equality), and briefly justify symmetry $d(X,Y) = d(Y,X)$.
- 1 pt
- Triangle inequality [additive]
- Use the monotonicity or subadditivity of the function $f(t) = \frac{t}{1+t}$ (i.e., $f(s+t) \le f(s) + f(t)$ for $s, t \ge 0$) to derive $d(X,Y) \le d(X,Z) + d(Z,Y)$.
- *If only the triangle inequality formula is stated without proving the key algebraic inequality, no credit is awarded.*
- 1 pt
Part 2: Proving the equivalence with convergence in probability (max 5 pts)
- Sufficiency ($d(X_n, X) \to 0 \Rightarrow X_n \to X$ in probability): [additive]
- Use Chebyshev's inequality (or Markov's inequality) to establish the connection between $d(X_n, X)$ and $P(|X_n - X| \ge \varepsilon)$.
- Core argument: obtain $P(|X_n - X| \ge \varepsilon) \le \frac{1+\varepsilon}{\varepsilon}\, d(X_n, X)$, or note that on the event $\{|X_n - X| \ge \varepsilon\}$ the integrand has the lower bound $\frac{\varepsilon}{1+\varepsilon}$.
- 2 pts
- Necessity ($X_n \to X$ in probability $\Rightarrow d(X_n, X) \to 0$):
- Score exactly one chain; take the maximum subtotal among chains; do not add points across chains.
- `Chain A (Truncation/decomposition method)`
- Decompose the expectation: Split $d(X_n, X)$ into integrals over the regions $\{|X_n - X| \ge \varepsilon\}$ and $\{|X_n - X| < \varepsilon\}$ (or a similar approach). [1 pt]
- Bounding and taking limits: Correctly bound both parts (the first part by $P(|X_n - X| \ge \varepsilon)$, the second part by $\frac{\varepsilon}{1+\varepsilon} \le \varepsilon$), and let $n \to \infty$ and then $\varepsilon \to 0$ to show the limit is 0. [2 pts]
- `Chain B (Convergence theorem method)`
- Transfer of convergence in probability: Note that $X_n \to X$ in probability implies $\frac{|X_n - X|}{1+|X_n - X|} \to 0$ in probability. [1 pt]
- Cite a theorem: Invoke the Dominated Convergence Theorem (DCT) (dominated by 1) or the Bounded Convergence Theorem to conclude $d(X_n, X) = E\left[\frac{|X_n - X|}{1+|X_n - X|}\right] \to 0$. [2 pts]
Total (max 7)
II. Zero-credit items
- Merely copying the definition of the metric or the definition of convergence in probability from the problem statement without performing any derivation.
- In proving the triangle inequality, simply asserting that $|X-Y| \le |X-Z| + |Z-Y|$ implies the corresponding inequality for expectations, without addressing the effect of the denominator $1 + |X-Y|$.
- In Part 2, merely stating that "convergence in expectation implies convergence in probability" or vice versa, without providing a proof specific to the nonlinear metric $d$.
III. Deductions
- Omitting almost sure equality (a.s.): In proving positive definiteness, if it is not stated that $X = Y$ holds only in the "almost sure" sense (or $P(X = Y) = 1$), deduct 1 point.
- Confusing convergence concepts: In the necessity proof, if pointwise convergence or almost sure convergence of $X_n$ to $X$ is erroneously assumed in order to directly interchange limit and integral, without mentioning subsequences or properties of convergence in probability, deduct 1 point (in Chain B, if DCT is used, the version under convergence in probability must be made explicit or the subsequence principle must be invoked; otherwise this is treated as a logical gap).
- Circular reasoning: If the conclusion to be proved is used as a premise in the proof (e.g., directly using the fact that $d$ metrizes convergence in probability to prove the convergence property), that part receives 0 points.