In this section we define the expectation of a function with respect to (i) a probability law specified by its distribution function, and (ii) a numerical \(n\) -tuple valued random phenomenon.
The Stieltjes Integral. In section 2 we defined the expectation of a continuous function \(g(x)\) with respect to a probability law specified by a probability mass function or by a probability density function. We now consider the case of a general probability law, specified by its distribution function \(F(\cdot)\).
In order to define the expectation with respect to a probability law specified by a distribution function \(F(\cdot)\), we require a generalization of the notion of integral, which goes under the name of the Stieltjes integral. Given a continuous function \(g(x)\), a distribution function \(F(\cdot)\), and a half-open interval \((a, b]\) on the real line (that is, \((a, b]\) consists of all the points strictly greater than \(a\) and less than or equal to \(b\)), we define the Stieltjes integral of \(g(\cdot)\), with respect to \(F(\cdot)\) over \((a, b]\), written \(\int_{a+}^{b} g(x) \, d F(x)\), as follows. We start with a partition of the interval \((a, b]\) into \(n\) subintervals \(\left(x_{i-1}, x_{i}\right]\), in which \(x_{0}, x_{1}, \ldots, x_{n}\) are \((n+1)\) points chosen so that \(a=x_{0}<x_{1}<\cdots<x_{n}=b\). In each subinterval we next choose a point \(x_{i}^{\prime}\) satisfying \(x_{i-1}<x_{i}^{\prime} \leq x_{i}\). We then define
\[\int_{a+}^{b} g(x) d F(x)=\underset{n \rightarrow \infty}{\operatorname{limit}} \sum_{i=1}^{n} g\left(x_{i}^{\prime}\right)\left[F\left(x_{i}\right)-F\left(x_{i-1}\right)\right] \tag{6.1}\]
in which the limit is taken over all partitions of the interval \((a, b]\) , as the maximum length of subinterval in the partition tends to 0.
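Although (6.1) is stated as a limit over all partitions, it can be approximated numerically for a smooth \(F(\cdot)\) by a single fine uniform partition. The following sketch (an illustrative Python fragment, not part of the text; the exponential distribution function used as a check is an assumed example) takes \(x_{i}^{\prime}=x_{i}\), the right endpoint of each subinterval.

```python
import math

def stieltjes_integral(g, F, a, b, n=100_000):
    """Approximate the Stieltjes integral of g with respect to F over (a, b]
    by the sum in (6.1), using a uniform partition a = x_0 < x_1 < ... < x_n = b
    and choosing x_i' = x_i (the right endpoint of each subinterval)."""
    total = 0.0
    F_prev = F(a)
    for i in range(1, n + 1):
        x_i = a + (b - a) * i / n
        F_i = F(x_i)
        total += g(x_i) * (F_i - F_prev)
        F_prev = F_i
    return total

# Check against a known case: F the exponential distribution function with
# mean 1 and g(x) = x, so that the integral over (0, 50] is very nearly 1.
F = lambda x: 1.0 - math.exp(-x)
approx = stieltjes_integral(lambda x: x, F, 0.0, 50.0)
```

For an increasing \(g\), the right-endpoint choice overestimates by at most the mesh length times \(F(b)-F(a)\), so the approximation error here is below \(10^{-3}\).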
It may be shown that if \(F(\cdot)\) is specified by a probability density function \(f(\cdot)\) , then
\[\int_{a+}^{b} g(x) \, d F(x)=\int_{a}^{b} g(x) f(x) \, d x, \tag{6.2}\]
whereas if \(F(\cdot)\) is specified by a probability mass function \(p(\cdot)\) then
\[\int_{a+}^{b} g(x) \, d F(x)=\sum_{\substack{\text {over all } x \text { such that} \\ a<x \leq b \text { and } p(x)>0}} g(x) p(x). \tag{6.3}\]
The Stieltjes integral of the continuous function \(g(\cdot)\) , with respect to the distribution function \(F(\cdot)\) over the whole real line, is defined by
\[\int_{-\infty}^{\infty} g(x) d F(x)=\lim _{\substack{a \rightarrow-\infty \\ b \rightarrow \infty}} \int_{a+}^{b} g(x) dF(x). \tag{6.4}\]
The discussion in section 2 in regard to the existence and finiteness of integrals over the real line applies also to Stieltjes integrals. We say that \(\int_{-\infty}^{\infty} g(x) d F(x)\) exists if and only if \(\int_{-\infty}^{\infty}|g(x)| d F(x)\) is finite. Thus only absolutely convergent Stieltjes integrals are to be invested with sense.
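A standard example illustrates why absolute convergence is demanded. For the probability law specified by the probability density function \(f(x)=\frac{1}{\pi\left(1+x^{2}\right)}\) (the Cauchy probability law),
\[\int_{-\infty}^{\infty}|x| \, d F(x)=\frac{2}{\pi} \int_{0}^{\infty} \frac{x}{1+x^{2}} \, d x=\infty,\]
so that the expectation of \(g(x)=x\) does not exist, even though the symmetric limit \(\lim _{b \rightarrow \infty} \int_{-b+}^{b} x \, d F(x)\) equals 0.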
We now define the expectation of a continuous function \(g(\cdot)\) , with respect to a probability law specified by a distribution function \(F(\cdot)\) , as the Stieltjes integral of \(g(\cdot)\) , with respect to \(F(\cdot)\) over the infinite real line; in symbols,
\[E[g(x)]=\int_{-\infty}^{\infty} g(x) dF(x). \tag{6.5}\]
Stieltjes integrals are mainly of theoretical interest. They provide a compact way of defining, and working with, the properties of expectation. In practice, one evaluates a Stieltjes integral by breaking it up into the sum of an ordinary integral and an ordinary summation, by means of the following theorem: if there exist a probability density function \(f(\cdot)\), a probability mass function \(p(\cdot)\), and constants \(c_{1}\) and \(c_{2}\), whose sum is 1, such that for every \(x\)
\[F(x)=c_{1} \int_{-\infty}^{x} f\left(x^{\prime}\right) d x^{\prime}+c_{2} \sum_{\substack{\text { over all } x^{\prime} \leq x \text { such } \\ \text { that } p\left(x^{\prime}\right)>0}} p\left(x^{\prime}\right), \tag{6.6}\]
then for any continuous function \(g(\cdot)\)
\[\int_{-\infty}^{\infty} g(x) d F(x)=c_{1} \int_{-\infty}^{\infty} g(x) f(x) d x+c_{2} \sum_{\substack{\text { over all } x \text { such } \\ \text { that } p(x)>0}} g(x) p(x). \tag{6.7}\]
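As a numerical illustration of the decomposition (6.7), consider an assumed mixed law (not one from the text) with \(c_{1}=c_{2}=\frac{1}{2}\), \(f\) the exponential density with mean 1, and \(p\) placing probability \(\frac{1}{2}\) at each of 0 and 2; then \(E[x]=\frac{1}{2} \cdot 1+\frac{1}{2} \cdot 1=1\).

```python
import math

# An assumed mixed probability law with c1 = c2 = 1/2:
#   continuous part  f(x) = e^{-x} for x >= 0   (exponential density, mean 1)
#   discrete part    p(0) = p(2) = 1/2          (mass function, mean 1)
c1, c2 = 0.5, 0.5
f = lambda x: math.exp(-x) if x >= 0 else 0.0
p = {0.0: 0.5, 2.0: 0.5}

def expectation(g, n=200_000, upper=60.0):
    """E[g(X)] via (6.7): c1 times an ordinary integral (midpoint rule over
    [0, upper], beyond which f is negligible) plus c2 times an ordinary sum
    over the points of positive probability mass."""
    h = upper / n
    integral = h * sum(g((i + 0.5) * h) * f((i + 0.5) * h) for i in range(n))
    mass_sum = sum(g(x) * p_x for x, p_x in p.items())
    return c1 * integral + c2 * mass_sum

mean = expectation(lambda x: x)   # approximately 1
```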
In giving the proofs of various propositions about probability laws, we most often confine ourselves to the case in which the probability law is specified by a probability density function, for then we need employ only ordinary integrals. However, the properties of Stieltjes integrals are very much the same as those of ordinary Riemann integrals; consequently, the proofs we give are immediately translatable into proofs for the general case, which require the use of Stieltjes integrals.
Expectations with Respect to Numerical \(n\)-Tuple Valued Random Phenomena. The foregoing ideas extend immediately to a numerical \(n\)-tuple valued random phenomenon. Given the distribution function \(F\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) of such a random phenomenon and any continuous function \(g\left(x_{1}, \ldots, x_{n}\right)\) of \(n\) real variables, we define the expectation of the function with respect to the random phenomenon by
\[E\left[g\left(x_{1}, x_{2}, \ldots, x_{n}\right)\right] =\underset{R_n}{\iint \cdots \int} g\left(x_{1}, x_{2}, \ldots, x_{n}\right) d F\left(x_{1}, x_{2}, \ldots, x_{n}\right) \tag{6.8}\]
in which the integral is a Stieltjes integral over the space \(R_{n}\) of all \(n\) -tuples \(\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) of real numbers. We shall not write out here the definition of this integral.
We note that (6.2) and (6.3) generalize. If the distribution function \(F\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) is specified by a probability density function \(f\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) so that \(\left(7.7^{\prime}\right)\) of Chapter 4 holds, then
\[E[g(x_{1}, x_{2}, \ldots, x_{n})] = \underbrace{\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}}_{\text{$n$ integrals}} g(x_{1}, x_{2}, \ldots, x_{n}) f(x_{1}, x_{2}, \ldots, x_{n}) \, dx_{1} \, dx_{2} \cdots \, dx_{n} \tag{6.9}\]
If the distribution function \(F\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) is specified by a probability mass function \(p\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) , so that \(\left(7.8^{\prime}\right)\) of Chapter 4 holds, then
\begin{align} & E\left[g\left(x_{1}, x_{2}, \ldots, x_{n}\right)\right] \tag{6.10} \\ & =\sum_{\substack{\text { over all }\left(x_{1}, x_{2}, \ldots, x_{n}\right) \\ \text { such that } p\left(x_{1}, x_{2}, \ldots, x_{n}\right)>0}} g\left(x_{1}, x_{2}, \ldots, x_{n}\right) p\left(x_{1}, x_{2}, \ldots, x_{n}\right) . \end{align}
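When the probability mass function has only finitely many points of positive probability, (6.10) is a finite sum that can be evaluated directly. A minimal sketch, using an assumed two-point uniform mass function \(p\left(x_{1}, x_{2}\right)=\frac{1}{4}\) for \(x_{1}, x_{2} \in\{1,2\}\) and \(g\left(x_{1}, x_{2}\right)=x_{1} x_{2}\):

```python
from fractions import Fraction

# An assumed probability mass function: p(x1, x2) = 1/4 for x1, x2 in {1, 2}.
# By (6.10), E[x1 * x2] is the sum of g(x1, x2) * p(x1, x2) over the points
# of positive probability mass: (1 + 2 + 2 + 4) / 4 = 9/4.
p = Fraction(1, 4)
expectation = sum(Fraction(x1 * x2) * p
                  for x1 in (1, 2) for x2 in (1, 2))
```

Exact rational arithmetic (`fractions.Fraction`) is used so the finite sum incurs no rounding.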
Exercises
6.1. Compute the mean, variance, and moment-generating function of each of the probability laws specified by the following distribution functions. (Recall that \([x]\) denotes the largest integer less than or equal to \(x\).)
\begin{align} &\text{(i)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 0 \\[2mm] 1 - \frac{1}{3} e^{-(x / 3)} - \frac{2}{3} e^{- [x / 3]}, & \text{ for } x \geq 0. \end{cases} \\[2mm] &\text{(ii)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 0 \\[2mm] 8 \displaystyle\int_{0}^{x} y e^{-4y} \, dy + \frac{e^{-2}}{2} \sum_{k=0}^{[x]} \frac{2^{k}}{k!}, & \text{ for } x \geq 0. \end{cases} \\[2mm] & \text{(iii)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 1 \\[2mm] 1 - \frac{1}{2x^{2}} - \frac{1}{2^{[x]}}, & \text{ for } x \geq 1. \end{cases}\\[2mm] & \text{(iv)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 1 \\[2mm] 1 - \frac{2}{3x} - \frac{1}{3^{[x]}}, & \text{ for } x \geq 1. \end{cases} \end{align}
Answer
(i) m.g.f., \(\frac{1}{3} \frac{1}{1-3 t}+\frac{2}{3} \frac{1-e^{-1}}{1-e^{3 t-1}} e^{3 t}\); (ii) m.g.f., \(\frac{1}{2}\left(1-\frac{t}{4}\right)^{-2}+\frac{1}{2} e^{2\left(e^{t}-1\right)}\); (iii) mean, \(\frac{5}{2}\); variance, \(\infty\); m.g.f. does not exist; (iv) mean, \(\infty\); variance and m.g.f. do not exist.
6.2. Compute the expectation of the function \(g\left(x_{1}, x_{2}\right)=x_{1} x_{2}\) with respect to the probability laws of the numerical 2-tuple valued random phenomenon specified by the following probability density functions or probability mass functions:
\begin{align} &\text{(i)} \quad f\left(x_{1}, x_{2}\right) = \exp \left(-2\left|x_{1}\right|-2\left|x_{2}\right|\right) \\[2mm] &\text{(ii)} \quad f\left(x_{1}, x_{2}\right) = \begin{cases} \frac{1}{\left(b_{1}-a_{1}\right)\left(b_{2}-a_{2}\right)}, & \text{if } a_{1} \leq x_{1} \leq b_{1} \text{ and } a_{2} \leq x_{2} \leq b_{2} \\[2mm] 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(iii)} \quad f\left(x_{1}, x_{2}\right) = \frac{1}{2 \pi \sqrt{1-\rho^{2}}} \exp \left[\frac{-(x_{1}^{2}+x_{2}^{2}+2 \rho x_{1} x_{2})}{2\left(1-\rho^{2}\right)}\right] \quad \text{in which } |\rho|<1. \\[2mm] &\text{(iv)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \frac{1}{36}, & \text{if } x_{1}=1, 2, 3, \ldots,6 \text{ and } x_{2}=1, 2, \ldots,6 \\[2mm] 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(v)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \binom{6}{x_{1}} \binom{6}{x_{2}} \left(\frac{2}{3}\right)^{x_{1}+x_{2}} \left(\frac{1}{3}\right)^{12-x_{1}-x_{2}}, & \text{for } x_{1} \text{ and } x_{2} \text{ equal to nonnegative integers} \\ 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(vi)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \frac{12!}{x_{1}! \, x_{2}! \, \left(12-x_{1}-x_{2}\right)!} \left(\frac{1}{4}\right)^{x_{1}} \left(\frac{1}{3}\right)^{x_{2}} \left(\frac{5}{12}\right)^{12-x_{1}-x_{2}}, & \text{for } x_{1} \text{ and } x_{2} \text{ equal to nonnegative integers such that } x_{1}+x_{2} \leq 12 \\ 0, & \text{otherwise.} \end{cases} \end{align}