MathIsimple

Probability Theory – Problem 8: Determine the value of $c$

Question

A random variable $X$ has probability density function $p(x)=\frac{c}{x}e^{-\frac{(\ln x)^{2}}{2}}$, $x>0$. (1) Determine the value of $c$. (2) Find the distribution of $\ln X$. (3) Discuss whether $(X,\ \ln X)$ is a continuous random vector.

Step-by-step solution

Part 1: Determining the value of $c$

Step 1. Set up the normalisation integral. The given density is $p(x)=\dfrac{c}{x}e^{-\dfrac{(\ln x)^2}{2}}$, $x>0$. A necessary condition for this to be a probability density function is $\int_{0}^{+\infty}p(x)\,dx=1.$

Substituting $p(x)$ yields $\int_{0}^{+\infty}\dfrac{c}{x}e^{-\dfrac{(\ln x)^2}{2}}\,dx=1.$

Step 2. Evaluate the integral by substitution.

Let $t=\ln x$, so that $x=e^t$ and $dx=e^t\,dt$. Then $\dfrac{dx}{x}=\dfrac{e^t\,dt}{e^t}=dt.$

As $x$ ranges from $0$ to $+\infty$, $t=\ln x$ ranges from $-\infty$ to $+\infty$. Hence

$$\int_{0}^{+\infty}\dfrac{c}{x}e^{-\dfrac{(\ln x)^2}{2}}\,dx =c\int_{0}^{+\infty}\dfrac{1}{x}e^{-\dfrac{(\ln x)^2}{2}}\,dx =c\int_{-\infty}^{+\infty}e^{-\dfrac{t^2}{2}}\,dt.$$

It is well known that $\int_{-\infty}^{+\infty}e^{-\dfrac{t^2}{2}}\,dt=\sqrt{2\pi}.$

Therefore the normalisation condition becomes $c\sqrt{2\pi}=1.$

Step 3. Solve for the constant $c$.

$$c=\dfrac{1}{\sqrt{2\pi}}.$$

Hence the value of $c$ is $\dfrac{1}{\sqrt{2\pi}}$.
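
As a quick sanity check of the value just obtained, the Gaussian integral from Step 2 can be approximated numerically. The following is only a minimal sketch using the Python standard library; the helper `midpoint_integral`, the truncation bound `T` and the grid size are illustrative choices, not part of the original solution.

```python
import math

def midpoint_integral(f, a, b, n):
    """Approximate the integral of f over [a, b] with the midpoint rule on n cells."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# After the substitution t = ln x, the normalisation integral is the integral of
# exp(-t^2 / 2) over the real line. It is truncated to [-T, T] here (an assumption
# of this sketch); the neglected tails are of order exp(-T^2 / 2).
T = 10.0
gaussian_integral = midpoint_integral(lambda t: math.exp(-0.5 * t * t), -T, T, 20_000)

c_numeric = 1.0 / gaussian_integral          # from c * integral = 1
c_exact = 1.0 / math.sqrt(2.0 * math.pi)     # closed form derived above

print(c_numeric, c_exact)
```

The two printed values should agree to many decimal places, which is consistent with $c=1/\sqrt{2\pi}$.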

Part 2: Finding the distribution of $\ln X$

Step 1. Obtain the density via the change-of-variable formula.

Let $Y=\ln X$ and write $g(x)=\ln x$, whose inverse is $x=e^y$. Since $X$ takes values in $(0,+\infty)$, the range of $Y$ is the entire real line, i.e. $y\in(-\infty,+\infty)$.

For a continuous random variable, the one-dimensional invertible transformation formula states: if $Y=g(X)$ and $g$ is strictly monotone with a differentiable inverse, then $f_Y(y)=f_X(x)\left|\dfrac{dx}{dy}\right|\bigg|_{x=g^{-1}(y)}.$

Here $x=e^y$ and $\dfrac{dx}{dy}=e^y$, so $f_Y(y)=f_X(e^y)\cdot e^y.$

Step 2. Substitute the known density $f_X$ and simplify.

From Part (1), $f_X(x)=\dfrac{1}{\sqrt{2\pi}}\dfrac{1}{x}e^{-\dfrac{(\ln x)^2}{2}},\quad x>0.$

Setting $x=e^y$ gives

$$f_Y(y)=\dfrac{1}{\sqrt{2\pi}}\dfrac{1}{e^y}e^{-\dfrac{(\ln e^y)^2}{2}}\times e^y =\dfrac{1}{\sqrt{2\pi}}e^{-\dfrac{y^2}{2}},\quad y\in\mathbb{R}.$$

This is precisely the probability density function of the standard normal distribution.
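
The simplification above can be spot-checked pointwise. The sketch below evaluates $f_X(e^y)\cdot e^y$ at a few values of $y$ and compares it with the standard normal density; the function names and the list of check points are hypothetical and chosen only for illustration.

```python
import math

C = 1.0 / math.sqrt(2.0 * math.pi)  # normalising constant from Part (1)

def density_x(x: float) -> float:
    """Density of X: (C / x) * exp(-(ln x)^2 / 2) for x > 0, and 0 otherwise."""
    if x <= 0.0:
        return 0.0
    return C / x * math.exp(-0.5 * math.log(x) ** 2)

def density_ln_x(y: float) -> float:
    """Change-of-variable formula: f_Y(y) = f_X(e^y) * |dx/dy| with x = e^y."""
    x = math.exp(y)
    return density_x(x) * x

def standard_normal_pdf(y: float) -> float:
    return C * math.exp(-0.5 * y * y)

check_points = [-3.0, -1.0, 0.0, 0.5, 2.0]
max_gap = max(abs(density_ln_x(y) - standard_normal_pdf(y)) for y in check_points)
print(max_gap)  # expected to be at the level of floating-point rounding
```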

Step 3. State the conclusion.

Therefore $Y=\ln X$ follows the standard normal distribution, i.e. $\ln X\sim \text{N}(0,1).$

Equivalently, the distribution function is $P(\ln X\le y)=\Phi(y),\quad y\in\mathbb{R},$ where $\Phi(y)$ denotes the standard normal distribution function.
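
The distribution-function form of the result can also be cross-checked numerically by integrating the density of $X$ over $(0, e^y]$ and comparing with $\Phi(y)$. This is a rough sketch under stated assumptions: the midpoint rule, the grid size `n` and the three test values of `y` are arbitrary illustrative choices, and $\Phi$ is computed from `math.erf`.

```python
import math

C = 1.0 / math.sqrt(2.0 * math.pi)  # normalising constant from Part (1)

def density_x(x: float) -> float:
    """Density of X from Part (1), valid for x > 0."""
    return C / x * math.exp(-0.5 * math.log(x) ** 2)

def cdf_ln_x(y: float, n: int = 100_000) -> float:
    """P(ln X <= y) = P(X <= e^y), approximated by the midpoint rule on (0, e^y]."""
    upper = math.exp(y)
    h = upper / n
    return h * sum(density_x((k + 0.5) * h) for k in range(n))

def standard_normal_cdf(y: float) -> float:
    """Phi(y) expressed through the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

max_cdf_gap = max(
    abs(cdf_ln_x(y) - standard_normal_cdf(y)) for y in (-1.0, 0.0, 1.0)
)
print(max_cdf_gap)  # small if P(ln X <= y) matches Phi(y)
```

The midpoint rule never evaluates the density at $x=0$, which avoids the singular-looking factor $1/x$; the density itself tends to $0$ as $x\to 0^{+}$.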

Part 3: Discussing whether $(X, \ln X)$ is a continuous random vector

Step 1. Recall the definition of a continuous two-dimensional random vector.

A random vector $(U,V)$ is called a continuous random vector if there exists a non-negative integrable function $f_{U,V}(u,v)$ such that for every measurable set $B\subset\mathbb{R}^2$, $P\big((U,V)\in B\big)=\iint_B f_{U,V}(u,v)\,du\,dv,$ and $\iint_{\mathbb{R}^2} f_{U,V}(u,v)\,du\,dv=1.$

Under this definition, if $f_{U,V}$ exists, then for any set $B$ of planar Lebesgue measure zero, $P\big((U,V)\in B\big)=0.$

Step 2. Analyse the support of $(X,\ln X)$.

Set $Y=\ln X$. Then $P\big(Y=\ln X\big)=1,$ meaning the random vector $(X,Y)$ satisfies $y=\ln x$ almost surely.

Consequently, the values of $(X,Y)$ are almost surely contained in the set $A=\big\{(x,y)\in\mathbb{R}^2:\ x>0,\ y=\ln x\big\},$ i.e. $P\big((X,Y)\in A\big)=1.$

However, $A$ is a smooth curve in the plane; as a one-dimensional curve in $\mathbb{R}^2$, its two-dimensional Lebesgue measure is zero.
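
One way to justify the measure-zero claim is a covering argument. The derivation below is an added sketch, not part of the original solution; the symbols $a$, $b$, $n$, $x_k$, $A_{a,b}$ and $\lambda_2$ (two-dimensional Lebesgue measure) are notation introduced here for illustration.

```latex
% Piece of the curve over a compact interval [a,b] with 0 < a < b:
A_{a,b} = \{(x,\ln x) : a \le x \le b\}.

% Split [a,b] into n equal parts, x_k = a + k(b-a)/n. Since \ln is increasing,
% the piece of the curve over [x_{k-1}, x_k] lies in a rectangle of width
% (b-a)/n and height \ln x_k - \ln x_{k-1}. Summing the areas (the heights telescope):
\lambda_2(A_{a,b})
  \le \sum_{k=1}^{n} \frac{b-a}{n}\,\bigl(\ln x_k - \ln x_{k-1}\bigr)
  = \frac{(b-a)(\ln b - \ln a)}{n}
  \to 0 \quad (n \to \infty).

% The whole curve is a countable union of such pieces, so by countable subadditivity:
A = \bigcup_{m=2}^{\infty} A_{1/m,\,m}
\quad\Longrightarrow\quad
\lambda_2(A) \le \sum_{m=2}^{\infty} \lambda_2\bigl(A_{1/m,\,m}\bigr) = 0.
```

The same argument works for the graph of any monotone function, which is why no special property of the logarithm beyond monotonicity is used.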

Step 3. Compare with the definition to reach the conclusion.

If $(X,Y)$ were a continuous random vector, then for any set $B$ of Lebesgue measure zero we would have $P\big((X,Y)\in B\big)=0$. Yet here there exists a set $A$ of measure zero with $P\big((X,Y)\in A\big)=1$.

This contradicts the properties of a continuous random vector. Therefore $(X,\ \ln X)$ is not a continuous random vector.

Final answer

1. $c=\dfrac{1}{\sqrt{2\pi}}$.
2. $\ln X\sim \text{N}(0,1)$, with density $f_{\ln X}(y)=\dfrac{1}{\sqrt{2\pi}}e^{-\dfrac{y^2}{2}}$, $y\in\mathbb{R}$.
3. The values of $(X,\ \ln X)$ are almost surely confined to the curve $y=\ln x$, a set of two-dimensional Lebesgue measure zero, which is incompatible with the definition of a continuous random vector; therefore $(X,\ \ln X)$ is not a continuous random vector.

Marking scheme

The following is the complete marking scheme for this probability theory problem (full marks: 7 points).


1. Checkpoints (Total 7 pts)

Part 1: Determining the value of $c$ (2 points)

*Note: This part tests the ability to find a constant using the normalisation property of integrals.*

  • Setting up the integral and performing the substitution [1 pt]
  • State the normalisation condition $\int_{0}^{+\infty} \frac{c}{x}e^{-\frac{(\ln x)^2}{2}}\,dx = 1$ and carry out the substitution $t=\ln x$ (or $dx/x = dt$), converting the integral into the Gaussian integral form $\int_{-\infty}^{+\infty} e^{-t^2/2}\,dt$.
  • *Note: If the substitution is not shown explicitly but the student directly invokes the definition of the log-normal distribution and correctly identifies the normalisation constant, full credit may still be awarded.*
  • Computing the result [1 pt]
  • Correctly obtain $c = \frac{1}{\sqrt{2\pi}}$.

Part 2: Finding the distribution of $\ln X$ (3 points)

*Note: Grade one of the following paths only; scores across paths are not cumulative. This part tests distribution transformation of a function of a random variable.*

  • Path A: Density transformation method
  • Jacobian/derivative term [1 pt]: Write down the inverse $x=e^y$ of the transformation $y=\ln x$ together with its derivative (or Jacobian determinant) $\frac{dx}{dy} = e^y$.
  • Substitution and simplification [1 pt]: Correctly substitute $x=e^y$ and the derivative into the density transformation formula $f_Y(y) = p(e^y) \cdot e^y$ and simplify to obtain $c\,e^{-y^2/2}$ or $\frac{1}{\sqrt{2\pi}} e^{-y^2/2}$.
  • Final conclusion [1 pt]: Explicitly state that $\ln X$ follows the standard normal distribution $N(0,1)$, or write out the complete standard normal probability density function (including the domain $y\in\mathbb{R}$).
  • Path B: Distribution function method
  • Definition and conversion [1 pt]: Write the distribution function $F_Y(y) = P(\ln X \le y) = P(X \le e^y) = \int_{0}^{e^y} p(x)\,dx$.
  • Integral substitution [1 pt]: Use the substitution $t=\ln x$ to convert the limits and integrand into the standard normal distribution function form $\int_{-\infty}^{y} c\,e^{-t^2/2}\,dt$.
  • Final conclusion [1 pt]: Identify the integral as the standard normal distribution; conclusion same as Path A.
  • Shared prerequisite [max 1 pt]: If the student made an error in Part 1 leading to an incorrect value of $c$, but the reasoning in this part is entirely correct, only the result mark is deducted; process marks are retained (follow-through marking).

Part 3: Discussing whether $(X, \ln X)$ is a continuous random vector (2 points)

*Note: This part tests understanding of the definition of a two-dimensional continuous random vector.*

  • Identifying the support set / measure [1 pt]
  • Point out that the probability mass of the random vector $(X, \ln X)$ is concentrated on the plane curve $y = \ln x$; or note that its support has two-dimensional Lebesgue measure zero.
  • Stating the conclusion [1 pt]
  • Based on the above reasoning (a set of measure zero carries probability one, or no joint density $f(x,y)$ with respect to two-dimensional Lebesgue measure can exist), conclude that $(X, \ln X)$ is not a continuous random vector.

Total (max 7)


2. Zero-credit items

  • Copying the problem statement: Merely restating the given conditions or formula names (e.g. "use the density formula") without performing any concrete substitution or computation.
  • Unsupported guess in Part 3: Answering only "yes" or "no" in Part 3 without any mathematical justification.
  • Conceptual confusion: Arguing in Part 3 that "because the marginal distributions of $X$ and $\ln X$ are both continuous, the joint distribution is also continuous" (this conclusion is false; award 0 points).
  • Incorrect independence assumption: Attempting in Part 3 to construct a density via $f(x,y) = f_X(x) \cdot f_Y(y)$ (independence is not given, and the variables are in fact perfectly dependent; this approach receives 0 points).

3. Deductions

*Apply at most one of the following deductions (whichever is most severe); the total score cannot fall below 0.*

  • Computational/arithmetic error (-1 pt):
  • Errors in handling constants during integration (e.g. omitting $\sqrt{2\pi}$, miscalculating coefficients), leading to an incorrect value of $c$ or incorrect distribution parameters.
  • Missing domain (-1 pt):
  • When writing the probability density function as a final answer, failing to specify the range of the variable (e.g. $y \in \mathbb{R}$ or $-\infty < y < +\infty$). This deduction is waived if the student explicitly writes "follows the standard normal distribution" in words.
  • Logical/notational confusion (-1 pt):
  • Severely confusing the random variable notation (uppercase $X$) with the realisation notation (lowercase $x$), or writing logically incoherent integration limits (e.g. $0$ to $\ln x$), rendering the mathematical expression meaningless.