In this section we define the notion of convergence in distribution of a sequence of random variables \(Z_{1}, Z_{2}, \ldots, Z_{n}\) to a random variable \(Z\), which is the notion of convergence most used in applications of probability theory. The notion of convergence in distribution of a sequence of random variables can be defined in a large number of equivalent ways, each of which is important for certain purposes. Instead of choosing any one of them as the definition, we prefer to introduce all the equivalent concepts simultaneously.
Theorem 3A. Definitions and Theorems Concerning Convergence in Distribution. For \(n=1,2, \ldots\) , let \(Z_{n}\) be a random variable with distribution function \(F_{Z_{n}}(\cdot)\) and characteristic function \(\phi_{Z_{n}}(\cdot)\) . Similarly, let \(Z\) be a random variable with distribution function \(F_{Z}(\cdot)\) and characteristic function \(\phi_{Z}(\cdot)\) . We define the sequence \(\left\{Z_{n}\right\}\) as converging in distribution to the random variable \(Z\) , denoted by
\[\lim _{n \rightarrow \infty} \mathscr{L}\left(Z_{n}\right)=\mathscr{L}(Z), \quad \text { or } \quad \mathscr{L}\left(Z_{n}\right) \rightarrow \mathscr{L}(Z), \tag{3.1}\]
and read “the law of \(Z_{n}\) converges to the law of \(Z\) ” if any one (and consequently all) of the following equivalent statements holds:
(i) For every bounded continuous function \(g(\cdot)\) of a real variable there is convergence of the expectation \(E\left[g\left(Z_{n}\right)\right]\) to \(E[g(Z)]\) ; that is, as \(n\) tends to \(\infty\) ,
\[E\left[g\left(Z_{n}\right)\right]=\int_{-\infty}^{\infty} g(z) d F_{Z_{n}}(z) \rightarrow \int_{-\infty}^{\infty} g(z) d F_{Z}(z)=E[g(Z)]. \tag{3.2}\]
(ii) At every real number \(u\) there is convergence of the characteristic functions; that is, as \(n\) tends to \(\infty\) ,
\[E\left[e^{i u Z_{n}}\right]=\phi_{Z_{n}}(u) \rightarrow \phi_{Z}(u)=E\left[e^{i u Z}\right]. \tag{3.3}\]
(iii) At every two points \(a\) and \(b\), where \(a < b\), at which the distribution function \(F_{Z}(\cdot)\) of the limit random variable \(Z\) is continuous, there is convergence of the probabilities of the interval \(a\) to \(b\); that is, as \(n\) tends to \(\infty\),
\[P\left[a < Z_{n} \leq b\right]=F_{Z_{n}}(b)-F_{Z_{n}}(a) \rightarrow F_{Z}(b)-F_{Z}(a)=P[a < Z \leq b]. \tag{3.4}\]
(iv) At every real number \(a\) that is a point of continuity of the distribution function \(F_{Z}(\cdot)\) there is convergence of the distribution functions; that is, as \(n\) tends to \(\infty\),
\[P\left[Z_{n} \leq a\right]=F_{Z_{n}}(a) \rightarrow F_{Z}(a)=P[Z \leq a]\]
(v) For every continuous function \(g(\cdot)\) , as \(n\) tends to \(\infty\) ,
\[P_{Z_{n}}[\{z: \quad g(z) \leq y\}]=F_{g\left(Z_{n}\right)}(y) \rightarrow F_{g(Z)}(y)=P_{Z}[\{z: \quad g(z) \leq y\}]\]
at every real number \(y\) at which the distribution function \(F_{g(Z)}(\cdot)\) is continuous.
Let us indicate briefly the significance of the most important of these statements. The practical meaning of convergence in distribution is expressed by (iii); the reader should compare the statement of the central limit theorem in section 5 of Chapter 8 to see that (iii) constitutes an exact mathematical formulation of the assertion that the probability law of \(Z\) “approximates” that of \(Z_{n}\) . From the point of view of establishing in practice that a sequence of random variables converges in distribution, one uses (ii), which constitutes a criterion for convergence in distribution in terms of characteristic functions. Finally, (v) represents a theoretical fact of the greatest usefulness in applications, for it asserts that if \(Z_{n}\) converges in distribution to \(Z\) then a sequence of random variables \(g\left(Z_{n}\right)\) , obtained as functions of the \(Z_{n}\) , converges in distribution to \(g(Z)\) if the function \(g(\cdot)\) is continuous.
We defer the proof of the equivalence of these statements to section 5.
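Before proceeding, it may be illuminating to see statements (ii) and (iv) verified numerically in a concrete case. The following sketch is not drawn from the text; the choice of \(Z_{n}\), uniform on the points \(1/n, 2/n, \ldots, n/n\), and of \(Z\), uniform on the interval 0 to 1, is merely an assumed example.

```python
import numpy as np

def phi_Zn(u, n):
    """Characteristic function of Z_n uniform on {1/n, ..., n/n}: the average of e^{iuk/n}."""
    k = np.arange(1, n + 1)
    return np.mean(np.exp(1j * u * k / n))

def phi_Z(u):
    """Characteristic function of the uniform distribution on (0, 1]."""
    return (np.exp(1j * u) - 1) / (1j * u) if u != 0 else 1.0

def F_Zn(a, n):
    """Distribution function of Z_n at a: P[Z_n <= a] = floor(n a)/n for a in [0, 1]."""
    return min(max(np.floor(n * a) / n, 0.0), 1.0)

u, a = 2.5, 0.3   # arbitrary test points; a = 0.3 is a continuity point of F_Z
for n in (10, 100, 1000, 10000):
    print(n, abs(phi_Zn(u, n) - phi_Z(u)), abs(F_Zn(a, n) - a))
# Both the statement (ii) error and the statement (iv) error shrink as n grows.
```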
The Continuity Theorem of Probability Theory. The inversion formulas of section 3 of Chapter 9 prove that there is a one-to-one correspondence between distribution and characteristic functions; given a distribution function \(F(\cdot)\) and its characteristic function
\[\phi(u)=\int_{-\infty}^{\infty} e^{i u x} d F(x), \tag{3.5}\]
there is no other distribution function of which \(\phi(\cdot)\) is the characteristic function. The results stated in theorem 3A show that the one-to-one correspondence between distribution and characteristic functions, regarded as a transformation between functions, is continuous in the sense that a sequence of distribution functions \(F_{n}(\cdot)\) converges to a distribution function \(F(\cdot)\) at all points of continuity of \(F(\cdot)\) if and only if the sequence of characteristic functions
\[\phi_{n}(u)=\int_{-\infty}^{\infty} e^{i u x} d F_{n}(x) \tag{3.6}\]
converges at each real number \(u\) to the characteristic function \(\phi(\cdot)\) of \(F(\cdot)\). Consequently, theorem 3A is often referred to as the continuity theorem of probability theory.
Theorem 3A has the following extremely important extension, of which the reader should be aware. Suppose that the sequence of characteristic functions \(\phi_{n}(\cdot)\) , defined by (3.6), has the property of converging at all real \(u\) to a function \(\phi(\cdot)\) , which is continuous at \(u=0\) . It may be shown that there is then a distribution function \(F(\cdot)\) , of which \(\phi(\cdot)\) is the characteristic function . In view of this fact, the continuity theorem of probability theory is sometimes formulated in the following way:
Consider a sequence of distribution functions \(F_{n}(x)\) , with characteristic functions \(\phi_{n}(u)\) , defined by (3.6). In order that a distribution function \(F(\cdot)\) exist such that \[\lim _{n \rightarrow \infty} F_{n}(x)=F(x)\] at all points \(x\) , which are continuity points of \(F(x)\) , it is necessary and sufficient that a function \(\phi(u)\) , continuous at \(u=0\) , exist such that \[\lim _{n \rightarrow \infty} \phi_{n}(u)=\phi(u) \quad \text { at all real } u.\]
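The requirement that the limit function be continuous at \(u = 0\) is essential, as the following numerical sketch suggests (the choice of \(Z_{n}\) normal with mean 0 and standard deviation \(n\) is an assumed example, not taken from the text). Here \(\phi_{n}(u) = e^{-n^{2}u^{2}/2}\) converges pointwise to a function equal to 1 at \(u = 0\) and 0 elsewhere, which is discontinuous at \(u = 0\), and correspondingly \(F_{n}(x)\) tends to \(\frac{1}{2}\) at every \(x\), which is not a distribution function.

```python
import numpy as np
from math import erf, sqrt

def phi_n(u, n):
    """Characteristic function of the normal law with mean 0, standard deviation n."""
    return np.exp(-0.5 * (n * u) ** 2)

def F_n(x, n):
    """Distribution function of the normal law with mean 0, standard deviation n."""
    return 0.5 * (1 + erf(x / (n * sqrt(2.0))))

for n in (1, 10, 100):
    print(n, phi_n(0.0, n), phi_n(0.1, n), F_n(1.0, n))
# phi_n(0) = 1 always, phi_n(0.1) -> 0, and F_n(1.0) -> 1/2:
# the probability mass escapes to plus and minus infinity, so no limit law exists.
```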
Expansions for the Characteristic Function. In the use of characteristic functions to prove theorems concerning convergence in distribution, a major role is played by expansions for the characteristic function, and for the logarithm of the characteristic function, of a random variable, such as those given in lemmas 3A and 3B. Throughout this chapter we employ the following convention regarding the use of the symbol \(\theta\): the symbol \(\theta\) is used to denote any real or complex valued quantity satisfying the inequality \(|\theta| \leq 1\). It is to be especially noted that the symbol \(\theta\) does not denote the same number each time it occurs, but only that the number represented by it has modulus not exceeding 1.
Lemma 3A. Let \(X\) be a random variable whose mean \(E[X]\) exists and is equal to 0 and whose variance \(\sigma^{2}[X]=E\left[X^{2}\right]\) is finite. Then (i) for any \(u\)
\[\phi_{X}(u)=1-\frac{1}{2} u^{2} E\left[X^{2}\right]-u^{2} \int_{0}^{1} d t(1-t) E\left[X^{2}\left(e^{i u t X}-1\right)\right]; \tag{3.7}\]
(ii) for any \(u\) such that \(3 u^{2} E\left[X^{2}\right] \leq 1, \log \phi_{X}(u)\) exists and satisfies \[\log \phi_{X}(u)=-\frac{1}{2} u^{2} E\left[X^{2}\right]-u^{2} \int_{0}^{1} d t(1-t) E\left[X^{2}\left(e^{i u t X}-1\right)\right] +3 \theta u^{4} E^{2}\left[X^{2}\right] \tag{3.8}\] for some number \(\theta\) such that \(|\theta| \leq 1\) . Further, if the third absolute moment \(E\left[|X|^{3}\right]\) is finite, then for \(u\) such that \(3 u^{2} E\left[X^{2}\right] \leq 1\)
\[\log \phi_{X}(u)=-\frac{1}{2} u^{2} E\left[X^{2}\right]+\frac{\theta}{6}|u|^{3} E\left[|X|^{3}\right]+3 \theta|u|^{4} E^{2}\left[X^{2}\right]. \tag{3.9}\]
Proof
Equation (3.7) follows immediately by integrating with respect to the distribution function of \(X\) the easily verified expansion
\[e^{i u x}=1+i u x-\frac{1}{2} u^{2} x^{2}-u^{2} x^{2} \int_{0}^{1} d t(1-t)\left(e^{i u t x}-1\right). \tag{3.10}\]
To show (3.8), we write [by (3.7)] that \(\log \phi_{X}(u)=\log (1-r)\) , in which
\[r=\frac{1}{2} u^{2} E\left[X^{2}\right]+u^{2} \int_{0}^{1} d t(1-t) E\left[X^{2}\left(e^{i u t X}-1\right)\right]. \tag{3.11}\]
Now \(|r| \leq 3 u^{2} E\left[X^{2}\right] / 2\) , so that \(|r| \leq \frac{1}{2}\) if \(u\) is such that \(3 u^{2} E\left[X^{2}\right] \leq 1\) . For any complex number \(r\) of modulus \(|r| \leq \frac{1}{2}\)
\begin{align} \log (1-r) & =-r \int_{0}^{1} \frac{1}{1-r t} dt, \tag{3.12} \\ \log (1-r)+r & =-r^{2} \int_{0}^{1} \frac{t}{1-r t} dt, \\ |\log (1-r)-(-r)| & \leq|r|^{2} \leq\left(\frac{9}{4}\right) u^{4} E^{2}\left[X^{2}\right] \end{align}
since \(|1-r t| \geq 1-|r t| \geq \frac{1}{2}\) . The proof of (3.8) is completed.
Finally, (3.9) follows immediately from (3.8), since
\[-u^{2} \int_{0}^{1} d t(1-t) E\left[X^{2}\left(e^{i u t X}-1\right)\right]=\frac{(i u)^{3}}{2} \int_{0}^{1} d t(1-t)^{2} E\left[X^{3} e^{i u t X}\right].\]
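The expansions of lemma 3A lend themselves to direct numerical verification. The following sketch checks the error bound implicit in (3.9) for an assumed example, \(X\) uniform on the interval \(-1\) to 1, for which \(E[X]=0\), \(E\left[X^{2}\right]=\frac{1}{3}\), \(E\left[|X|^{3}\right]=\frac{1}{4}\), and \(\phi_{X}(u)=\sin u / u\); the condition \(3u^{2}E\left[X^{2}\right] \leq 1\) here reads \(|u| \leq 1\).

```python
import numpy as np

# Moments of X uniform on [-1, 1] (an assumed example, not from the text).
EX2, EX3 = 1.0 / 3.0, 1.0 / 4.0

# (3.9) asserts |log phi_X(u) + (1/2) u^2 E[X^2]| <= |u|^3 E[|X|^3]/6 + 3 u^4 E^2[X^2]
# whenever 3 u^2 E[X^2] <= 1.
for u in (0.2, 0.5, 1.0):
    phi = np.sin(u) / u                                  # characteristic function of X
    err = abs(np.log(phi) + 0.5 * u**2 * EX2)            # actual remainder
    bound = (abs(u) ** 3) * EX3 / 6 + 3 * u**4 * EX2**2  # bound from (3.9)
    print(u, err, bound, err <= bound)
```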
Lemma 3B. In the same way that (3.7) and (3.8) are obtained, one may obtain expansions for the characteristic function of a random variable \(Y\) whose mean \(E[Y]\) exists: \begin{align} \phi_{Y}(u) & =1+i u E[Y]+i u \int_{0}^{1} d t E\left[Y\left(e^{i u t Y}-1\right)\right] \tag{3.13} \\ \log \phi_{Y}(u) & =i u E[Y]+i u \int_{0}^{1} d t E\left[Y\left(e^{i u t Y}-1\right)\right]+9 \theta u^{2} E^{2}[|Y|] \end{align} for \(u\) such that \(6|u| E[|Y|] \leq 1\) .
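Equation (3.13) may likewise be checked numerically. In the following sketch \(Y\) is taken, merely as an assumed example, to be uniform on the interval 0 to 1, so that \(\phi_{Y}(u) = (e^{iu}-1)/iu\) and \(E[Y] = \frac{1}{2}\); the double integral over \(t\) and over the distribution of \(Y\) is approximated by a midpoint rule.

```python
import numpy as np

def check(u, m=1000):
    """Compare phi_Y(u) with the right-hand side of (3.13) for Y uniform on (0, 1)."""
    t = (np.arange(m) + 0.5) / m                 # midpoint grid on [0, 1] for t
    y = (np.arange(m) + 0.5) / m                 # midpoint grid on [0, 1] for y
    T, Y = np.meshgrid(t, y, indexing="ij")
    inner = np.mean(Y * (np.exp(1j * u * T * Y) - 1), axis=1)   # E[Y(e^{iutY} - 1)] at each t
    rhs = 1 + 1j * u * 0.5 + 1j * u * np.mean(inner)            # right-hand side of (3.13)
    lhs = (np.exp(1j * u) - 1) / (1j * u)                       # exact phi_Y(u)
    return abs(lhs - rhs)

for u in (0.5, 1.0, 3.0):
    print(u, check(u))   # differences are small, limited only by the discretization
```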
Example 3A. Asymptotic normality of binomial random variables. In section 2 of Chapter 6 it is stated that a binomial random variable is approximately normally distributed. This assertion may be given a precise formulation in terms of the notion of convergence in distribution. Let \(S_{n}\) be the number of successes in \(n\) independent repeated Bernoulli trials, with probability \(p\) of success (and \(q=1-p\) of failure) at each trial, and let
\[Z_{n}=\frac{S_{n}-E\left[S_{n}\right]}{\sigma\left[S_{n}\right]}=\frac{S_{n}-n p}{\sqrt{n p q}}. \tag{3.14}\]
Let \(Z\) be any random variable that is normally distributed with mean 0 and variance 1. We now show that the sequence \(\left\{Z_{n}\right\}\) converges in distribution to \(Z\) . To prove this assertion, we first write the characteristic function of \(Z_{n}\) in the form
\begin{align} \phi_{Z_{n}}(u) & =\exp [-i u(n p / \sqrt{n p q})] \phi_{S_{n}}\left(\frac{u}{\sqrt{n p q}}\right) \tag{3.15} \\ & =[q \exp (-i u \sqrt{p / n q})+p \exp (i u \sqrt{q / n p})]^{n}. \end{align} Therefore, \[\log \phi_{Z_{n}}(u)=n \log \phi_{X}(u), \tag{3.16}\]
where we define
\[\phi_{X}(u)=q \exp (-i u \sqrt{p / n q})+p \exp (i u \sqrt{q / n p}). \tag{3.17}\]
Now \(\phi_{X}(u)\) is the characteristic function of a random variable \(X\) with mean, mean square, and absolute third moment given by
\begin{align} E[X] & = q\left(-\sqrt{\frac{p}{nq}}\right) + p \sqrt{\frac{q}{np}} = 0, \tag{3.18} \\[3mm] E\left[X^2\right] & = q\left(-\sqrt{\frac{p}{nq}}\right)^2 + p\left(\sqrt{\frac{q}{np}}\right)^2 = \frac{p+q}{n} = \frac{1}{n}, \\[3mm] E\left[|X|^3\right] & = q\left|-\sqrt{\frac{p}{nq}}\right|^3 + p\left|\sqrt{\frac{q}{np}}\right|^3 = \frac{q^2 + p^2}{(n^3 pq)^{1/2}}. \end{align}
By (3.9), we have the expansion for \(\log \phi_{X}(u)\), valid for \(u\) such that \(3 u^{2} E\left[X^{2}\right]=3 u^{2} / n \leq 1\):
\begin{align} \log \phi_{X}(u) & =-\frac{1}{2} u^{2} E\left[X^{2}\right]+\frac{\theta}{6}|u|^{3} E\left[|X|^{3}\right]+3 \theta|u|^{4} E^{2}\left[X^{2}\right] \tag{3.19} \\ & =-\frac{1}{2 n} u^{2}+\frac{\theta}{6}|u|^{3} \frac{q^{2}+p^{2}}{\left(n^{3} p q\right)^{1 / 2}}+3 \theta|u|^{4} \frac{1}{n^{2}}, \end{align}
in which \(\theta\) is some number such that \(|\theta| \leq 1\) .
In view of (3.16) and (3.19), we see that for fixed \(u \neq 0\) and for \(n\) so large that \(n \geq 3 u^{2}\) ,
\[\log \phi_{Z_{n}}(u)=-\frac{1}{2} u^{2}+\frac{\theta}{6}|u|^{3} \frac{q^{2}+p^{2}}{(n p q)^{1 / 2}}+\frac{3 \theta|u|^{4}}{n}, \tag{3.20}\]
which tends to \(\log \phi_{Z}(u)=-\frac{1}{2} u^{2}\) as \(n\) tends to infinity. By statement (ii) of theorem 3A, it follows that the sequence \(\left\{Z_{n}\right\}\) converges in distribution to \(Z\) .
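The convergence just established may be observed numerically. The following sketch (with \(p = 0.3\) chosen arbitrarily) evaluates \(\phi_{Z_{n}}(u)\) exactly from (3.15) and compares it with \(e^{-u^{2}/2}\); the error decreases roughly like \(n^{-1/2}\), in agreement with (3.20).

```python
import numpy as np

def phi_Zn(u, n, p):
    """Characteristic function of the standardized binomial, computed from (3.15)."""
    q = 1 - p
    base = q * np.exp(-1j * u * np.sqrt(p / (n * q))) \
         + p * np.exp(1j * u * np.sqrt(q / (n * p)))
    return base ** n

u, p = 1.5, 0.3   # arbitrary test point and success probability
for n in (10, 100, 1000, 10000):
    print(n, abs(phi_Zn(u, n, p) - np.exp(-0.5 * u**2)))
# The printed error shrinks by roughly sqrt(10) per line, as (3.20) predicts.
```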
Characteristic functions may be used to prove theorems concerning convergence in probability to a constant. In particular, the reader may easily verify the following lemma.
Lemma 3C. A sequence of random variables \(Z_{n}\) converges in probability to 0 if and only if it converges in distribution to 0, which is the case if and only if, for every real number \(u\) ,
\[\lim _{n \rightarrow \infty} \phi_{Z_{n}}(u)=1. \tag{3.21}\]
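As a quick illustration of lemma 3C, consider (as an assumed example, not from the text) \(Z_{n}\) uniform on the interval 0 to \(1/n\), which converges in probability to 0; its characteristic function indeed tends to 1.

```python
import numpy as np

def phi_Zn(u, n):
    """Characteristic function of the uniform distribution on (0, 1/n)."""
    v = u / n
    return (np.exp(1j * v) - 1) / (1j * v)

u = 5.0   # arbitrary test point
for n in (1, 10, 100, 1000):
    print(n, abs(phi_Zn(u, n) - 1))   # tends to 0, confirming (3.21)
```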
Theorem 3B. The law of large numbers for a sequence of independent, identically distributed random variables \(X_{1}, X_{2}, \ldots, X_{n}\) with common finite mean \(m\) . As \(n\) tends to \(\infty\) , the sample mean \((1 / n)\left(X_{1}+\cdots+X_{n}\right)\) converges in probability to the mean \(m=E[X]\) , in which \(X\) is a random variable obeying the common probability law of \(X_{1}, X_{2}, \ldots, X_{n}\) .
Proof
Define \(Y=X-E[X]\) and
\[Z_{n}=\frac{1}{n}\left(X_{1}+X_{2}+\cdots+X_{n}\right)-E[X].\]
To prove that the sample mean \((1 / n)\left(X_{1}+X_{2}+\cdots+X_{n}\right)\) converges in probability to the mean \(E[X]\) , it suffices to show that \(Z_{n}\) converges in distribution to 0. Now, for a given value of \(u\) and for \(n\) so large that \(n>6|u| E[|Y|]\)
\begin{align} \log \phi_{Z_{n}}(u) & = n \log \phi_{Y}\left(\frac{u}{n}\right) \tag{3.22} \\ & = n\left\{ i \frac{u}{n} \int_{0}^{1} dt\, E\left[Y\left(e^{i u t Y / n} - 1\right)\right] + 9 \theta \frac{u^{2}}{n^{2}} E^{2}\left[|Y|\right] \right\}, \end{align}
in which the term \(i(u/n) E[Y]\) from lemma 3B has been dropped, since \(E[Y]=0\). This expression tends to 0 as \(n\) tends to \(\infty\): the second term is of order \(1/n\), while in the first term, for each fixed \(t\), \(u\), and \(y\), \(e^{i u t y / n}\) tends to 1 as \(n\) tends to \(\infty\), and \(\left|Y\left(e^{i u t Y / n}-1\right)\right| \leq 2|Y|\), so that the expectation under the integral sign tends to 0 by the dominated convergence theorem. The proof is complete.
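The mechanism of this proof may be seen numerically. In the following sketch \(X\) is taken, as an assumed example, to be exponentially distributed with mean 1, so that \(\phi_{X}(u) = (1-iu)^{-1}\) and the centered sample mean \(Z_{n}\) has \(\phi_{Z_{n}}(u) = e^{-iu}(1-iu/n)^{-n}\), which tends to 1 in agreement with lemma 3C.

```python
import numpy as np

def phi_Zn(u, n):
    """Characteristic function of the centered sample mean of n exponential(1) variables."""
    return np.exp(-1j * u) * (1 - 1j * u / n) ** (-n)

u = 2.0   # arbitrary test point
for n in (10, 100, 1000, 10000):
    print(n, abs(phi_Zn(u, n) - 1))   # tends to 0 as n tends to infinity
```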
Exercises
3.1. Prove lemma 3C.
3.2. Let \(X_{1}, X_{2}, \ldots, X_{n}\) be independent random variables, each assuming each of the values +1 and -1 with probability \(\frac{1}{2}\). Let \(Y_{n}=\sum_{j=1}^{n} X_{j} / 2^{j}\). Find the characteristic function of \(Y_{n}\) and show that, as \(n\) tends to \(\infty\), for each \(u\), \(\phi_{Y_{n}}(u)\) tends to the characteristic function of a random variable \(Y\) uniformly distributed over the interval \(-1\) to 1. Consequently, evaluate probabilities of the form \(P\left[a < Y_{n} \leq b\right]\) approximately for large \(n\).
3.3. Let \(X_{1}, X_{2}, \ldots, X_{n}\) be independent random variables, identically distributed as the random variable \(X\) . For \(n=1,2, \ldots\) , let
\[Z_{n}=\frac{S_{n}-E\left[S_{n}\right]}{\sigma\left[S_{n}\right]}, \quad S_{n}=X_{1}+X_{2}+\cdots+X_{n}.\]
Assuming that \(X\) is (i) binomial distributed with parameters \(n=6\) and \(p=\frac{1}{3}\), (ii) Poisson distributed with parameter \(\lambda=2\), (iii) \(\chi^{2}\) distributed with \(\nu=2\) degrees of freedom, show, for each real number \(u\), that \(\lim_{n \rightarrow \infty} \log \phi_{Z_{n}}(u)=-\frac{1}{2} u^{2}\). Consequently, evaluate \(P\left[18 \leq S_{10} \leq 20\right]\) approximately.
3.4. For any integer \(r\) and \(0 < p < 1\), let \(N(r, p)\) denote the minimum number of trials required to obtain \(r\) successes in a sequence of independent repeated Bernoulli trials, in which the probability of success at each trial is \(p\). Let \(Z\) be a random variable \(\chi^{2}\) distributed with \(2r\) degrees of freedom. Show that, at each \(u\), \(\lim_{p \rightarrow 0} \phi_{2 p N(r, p)}(u)=\phi_{Z}(u)\). State in words the meaning of this result.
3.5. Let \(Z_{n}\) be binomial distributed with parameters \(n\) and \(p=\lambda / n\) , in which \(\lambda>0\) is a fixed constant. Let \(Z\) be Poisson distributed with parameter \(\lambda\) . For each \(u\) , show that \(\lim_{n \rightarrow \infty} \phi_{Z_{n}}(u)=\phi_{Z}(u)\) . State in words the meaning of this result.
3.6. Let \(Z\) be a random variable Poisson distributed with parameter \(\lambda\) . By use of characteristic functions, show that as \(\lambda\) tends to \(\infty\)
\[\mathscr{L}\left(\frac{Z-\lambda}{\sqrt{\lambda}}\right) \rightarrow \mathscr{L}(Y)\]
in which \(Y\) is normally distributed with mean 0 and variance 1.
3.7. Show that \(\operatorname{plim} _{n \rightarrow \infty} X_{n}=X\) implies that \(\lim _{n \rightarrow \infty} \mathscr{L}\left(X_{n}\right)=\mathscr{L}(X)\) .