In this section we define the expectation of a function with respect to (i) a probability law specified by its distribution function, and (ii) a numerical \(n\) -tuple valued random phenomenon.
The Stieltjes Integral. In section 2 we defined the expectation of a continuous function \(g(x)\) with respect to a probability law specified by a probability mass function or by a probability density function. We now consider the case of a general probability law, specified by its distribution function \(F(\cdot)\).
In order to define the expectation with respect to a probability law specified by a distribution function \(F(\cdot)\), we require a generalization of the notion of integral, which goes under the name of the Stieltjes integral. Given a continuous function \(g(x)\), a distribution function \(F(\cdot)\), and a half-open interval \((a, b]\) on the real line (that is, \((a, b]\) consists of all the points strictly greater than \(a\) and less than or equal to \(b\)), we define the Stieltjes integral of \(g(\cdot)\), with respect to \(F(\cdot)\) over \((a, b]\), written \(\int_{a+}^{b} g(x) \, d F(x)\), as follows. We start with a partition of the interval \((a, b]\) into \(n\) subintervals \(\left(x_{i-1}, x_{i}\right]\), in which \(x_{0}, x_{1}, \ldots, x_{n}\) are \((n+1)\) points chosen so that \(a=x_{0}<x_{1}<\cdots<x_{n}=b\). In each subinterval we next choose a point \(x_{i}^{\prime}\) satisfying \(x_{i-1}<x_{i}^{\prime} \leq x_{i}\). We then define
\[\int_{a+}^{b} g(x) d F(x)=\underset{n \rightarrow \infty}{\operatorname{limit}} \sum_{i=1}^{n} g\left(x_{i}^{\prime}\right)\left[F\left(x_{i}\right)-F\left(x_{i-1}\right)\right] \tag{6.1}\]
in which the limit is taken over all partitions of the interval \((a, b]\) , as the maximum length of subinterval in the partition tends to 0.
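Although (6.1) is stated as a limit over all partitions, it can be approximated numerically for a smooth \(F(\cdot)\) by a single fine uniform partition. The following sketch (an illustrative Python fragment, not part of the text; the exponential distribution function used as a check is an assumed example) takes \(x_{i}^{\prime}=x_{i}\), the right endpoint of each subinterval.

```python
import math

def stieltjes_integral(g, F, a, b, n=100_000):
    """Approximate the Stieltjes integral of g with respect to F over (a, b]
    by the sum in (6.1), using a uniform partition a = x_0 < x_1 < ... < x_n = b
    and choosing x_i' = x_i (the right endpoint of each subinterval)."""
    total = 0.0
    F_prev = F(a)
    for i in range(1, n + 1):
        x_i = a + (b - a) * i / n
        F_i = F(x_i)
        total += g(x_i) * (F_i - F_prev)
        F_prev = F_i
    return total

# Check against a known case: F the exponential distribution function with
# mean 1 and g(x) = x, so that the integral over (0, 50] is very nearly 1.
F = lambda x: 1.0 - math.exp(-x)
approx = stieltjes_integral(lambda x: x, F, 0.0, 50.0)
```

For an increasing \(g\), the right-endpoint choice overestimates by at most the mesh length times \(F(b)-F(a)\), so the approximation error here is below \(10^{-3}\).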
It may be shown that if \(F(\cdot)\) is specified by a probability density function \(f(\cdot)\) , then
\[\int_{a+}^{b} g(x) \, d F(x)=\int_{a}^{b} g(x) f(x) \, d x, \tag{6.2}\]
whereas if \(F(\cdot)\) is specified by a probability mass function \(p(\cdot)\) then
\[\int_{a+}^{b} g(x) \, d F(x)=\sum_{\substack{\text {over all } x \text { such that} \\ a<x \leq b \text { and } p(x)>0}} g(x) p(x). \tag{6.3}\]
The Stieltjes integral of the continuous function \(g(\cdot)\) , with respect to the distribution function \(F(\cdot)\) over the whole real line, is defined by
\[\int_{-\infty}^{\infty} g(x) d F(x)=\lim _{\substack{a \rightarrow-\infty \\ b \rightarrow \infty}} \int_{a+}^{b} g(x) dF(x). \tag{6.4}\]
The discussion in section 2 in regard to the existence and finiteness of integrals over the real line applies also to Stieltjes integrals. We say that \(\int_{-\infty}^{\infty} g(x) d F(x)\) exists if and only if \(\int_{-\infty}^{\infty}|g(x)| d F(x)\) is finite. Thus only absolutely convergent Stieltjes integrals are to be invested with sense.
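A standard example illustrates why absolute convergence is demanded. For the probability law specified by the probability density function \(f(x)=\frac{1}{\pi\left(1+x^{2}\right)}\) (the Cauchy probability law),
\[\int_{-\infty}^{\infty}|x| \, d F(x)=\frac{2}{\pi} \int_{0}^{\infty} \frac{x}{1+x^{2}} \, d x=\infty,\]
so that the expectation of \(g(x)=x\) does not exist, even though the symmetric limit \(\lim _{b \rightarrow \infty} \int_{-b+}^{b} x \, d F(x)\) equals 0.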
We now define the expectation of a continuous function \(g(\cdot)\) , with respect to a probability law specified by a distribution function \(F(\cdot)\) , as the Stieltjes integral of \(g(\cdot)\) , with respect to \(F(\cdot)\) over the infinite real line; in symbols,
\[E[g(x)]=\int_{-\infty}^{\infty} g(x) dF(x). \tag{6.5}\]
Stieltjes integrals are mainly of theoretical interest. They provide a compact way of defining, and working with, the properties of expectation. In practice, one evaluates a Stieltjes integral by breaking it up into the sum of an ordinary integral and an ordinary summation, by means of the following theorem: if there exist a probability density function \(f(\cdot)\), a probability mass function \(p(\cdot)\), and constants \(c_{1}\) and \(c_{2}\), whose sum is 1, such that for every \(x\)
\[F(x)=c_{1} \int_{-\infty}^{x} f\left(x^{\prime}\right) d x^{\prime}+c_{2} \sum_{\substack{\text { over all } x^{\prime} \leq x \text { such } \\ \text { that } p\left(x^{\prime}\right)>0}} p\left(x^{\prime}\right), \tag{6.6}\]
then for any continuous function \(g(\cdot)\)
\[\int_{-\infty}^{\infty} g(x) d F(x)=c_{1} \int_{-\infty}^{\infty} g(x) f(x) d x+c_{2} \sum_{\substack{\text { over all } x \text { such } \\ \text { that } p(x)>0}} g(x) p(x). \tag{6.7}\]
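As a numerical illustration of the decomposition (6.7), consider an assumed mixed law (not one from the text) with \(c_{1}=c_{2}=\frac{1}{2}\), \(f\) the exponential density with mean 1, and \(p\) placing probability \(\frac{1}{2}\) at each of 0 and 2; then \(E[x]=\frac{1}{2} \cdot 1+\frac{1}{2} \cdot 1=1\).

```python
import math

# An assumed mixed probability law with c1 = c2 = 1/2:
#   continuous part  f(x) = e^{-x} for x >= 0   (exponential density, mean 1)
#   discrete part    p(0) = p(2) = 1/2          (mass function, mean 1)
c1, c2 = 0.5, 0.5
f = lambda x: math.exp(-x) if x >= 0 else 0.0
p = {0.0: 0.5, 2.0: 0.5}

def expectation(g, n=200_000, upper=60.0):
    """E[g(X)] via (6.7): c1 times an ordinary integral (midpoint rule over
    [0, upper], beyond which f is negligible) plus c2 times an ordinary sum
    over the points of positive probability mass."""
    h = upper / n
    integral = h * sum(g((i + 0.5) * h) * f((i + 0.5) * h) for i in range(n))
    mass_sum = sum(g(x) * p_x for x, p_x in p.items())
    return c1 * integral + c2 * mass_sum

mean = expectation(lambda x: x)   # approximately 1
```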
In giving the proofs of various propositions about probability laws, we most often confine ourselves to the case in which the probability law is specified by a probability density function, for then we need employ only ordinary integrals. However, the properties of Stieltjes integrals are very much the same as those of ordinary Riemann integrals; consequently, the proofs we give are immediately translatable into proofs for the general case, which require the use of Stieltjes integrals.
Expectations with Respect to Numerical \(n\)-Tuple Valued Random Phenomena. The foregoing ideas extend immediately to a numerical \(n\)-tuple valued random phenomenon. Given the distribution function \(F\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) of such a random phenomenon and any continuous function \(g\left(x_{1}, \ldots, x_{n}\right)\) of \(n\) real variables, we define the expectation of the function with respect to the random phenomenon by
\[E\left[g\left(x_{1}, x_{2}, \ldots, x_{n}\right)\right] =\underset{R_n}{\iint \cdots \int} g\left(x_{1}, x_{2}, \ldots, x_{n}\right) d F\left(x_{1}, x_{2}, \ldots, x_{n}\right) \tag{6.8}\]
in which the integral is a Stieltjes integral over the space \(R_{n}\) of all \(n\) -tuples \(\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) of real numbers. We shall not write out here the definition of this integral.
We note that (6.2) and (6.3) generalize. If the distribution function \(F\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) is specified by a probability density function \(f\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) so that \(\left(7.7^{\prime}\right)\) of Chapter 4 holds, then
\[E[g(x_{1}, x_{2}, \ldots, x_{n})] = \underbrace{\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}}_{\text{$n$ integrals}} g(x_{1}, x_{2}, \ldots, x_{n}) f(x_{1}, x_{2}, \ldots, x_{n}) \, dx_{1} \, dx_{2} \cdots \, dx_{n} \tag{6.9}\]
If the distribution function \(F\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) is specified by a probability mass function \(p\left(x_{1}, x_{2}, \ldots, x_{n}\right)\) , so that \(\left(7.8^{\prime}\right)\) of Chapter 4 holds, then
\begin{align} & E\left[g\left(x_{1}, x_{2}, \ldots, x_{n}\right)\right] \tag{6.10} \\ & =\sum_{\substack{\text { over all }\left(x_{1}, x_{2}, \ldots, x_{n}\right) \\ \text { such that } p\left(x_{1}, x_{2}, \ldots, x_{n}\right)>0}} g\left(x_{1}, x_{2}, \ldots, x_{n}\right) p\left(x_{1}, x_{2}, \ldots, x_{n}\right) . \end{align}
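When the probability mass function has only finitely many points of positive probability, (6.10) is a finite sum that can be evaluated directly. A minimal sketch, using an assumed two-point uniform mass function \(p\left(x_{1}, x_{2}\right)=\frac{1}{4}\) for \(x_{1}, x_{2} \in\{1,2\}\) and \(g\left(x_{1}, x_{2}\right)=x_{1} x_{2}\):

```python
from fractions import Fraction

# An assumed probability mass function: p(x1, x2) = 1/4 for x1, x2 in {1, 2}.
# By (6.10), E[x1 * x2] is the sum of g(x1, x2) * p(x1, x2) over the points
# of positive probability mass: (1 + 2 + 2 + 4) / 4 = 9/4.
p = Fraction(1, 4)
expectation = sum(Fraction(x1 * x2) * p
                  for x1 in (1, 2) for x2 in (1, 2))
```

Exact rational arithmetic (`fractions.Fraction`) is used so the finite sum incurs no rounding.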
Exercises
6.1. Compute the mean, variance, and moment-generating function of each of the probability laws specified by the following distribution functions. (Recall that \([x]\) denotes the largest integer less than or equal to \(x\).)
\begin{align} &\text{(i)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 0 \\[2mm] 1 - \frac{1}{3} e^{-(x / 3)} - \frac{2}{3} e^{- [x / 3]}, & \text{ for } x \geq 0. \end{cases} \\[2mm] &\text{(ii)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 0 \\[2mm] 8 \displaystyle\int_{0}^{x} y e^{-4y} \, dy + \frac{e^{-2}}{2} \sum_{k=0}^{[x]} \frac{2^{k}}{k!}, & \text{ for } x \geq 0. \end{cases} \\[2mm] & \text{(iii)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 1 \\[2mm] 1 - \frac{1}{2x^{2}} - \frac{1}{2^{[x]}}, & \text{ for } x \geq 1. \end{cases}\\[2mm] & \text{(iv)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 1 \\[2mm] 1 - \frac{2}{3x} - \frac{1}{3^{[x]}}, & \text{ for } x \geq 1. \end{cases} \end{align}
Answer
(i) m.g.f., \(\frac{1}{3} \frac{1}{1-3 t}+\frac{2}{3} \frac{1-e^{-1}}{1-e^{3 t-1}} e^{3 t}\); (ii) m.g.f., \(\frac{1}{2}\left(1-\frac{t}{4}\right)^{-2}+\frac{1}{2} e^{2\left(e^{t}-1\right)}\); (iii) mean, \(\frac{5}{2}\); variance, \(\infty\); m.g.f. does not exist; (iv) mean, \(\infty\); variance and m.g.f. do not exist.
6.2. Compute the expectation of the function \(g\left(x_{1}, x_{2}\right)=x_{1} x_{2}\) with respect to the probability laws of the numerical 2-tuple valued random phenomenon specified by the following probability density functions or probability mass functions:
\begin{align} &\text{(i)} \quad f\left(x_{1}, x_{2}\right) = \exp \left(-2\left|x_{1}\right|-2\left|x_{2}\right|\right) \\[2mm] &\text{(ii)} \quad f\left(x_{1}, x_{2}\right) = \begin{cases} \frac{1}{\left(b_{1}-a_{1}\right)\left(b_{2}-a_{2}\right)}, & \text{if } a_{1} \leq x_{1} \leq b_{1} \text{ and } a_{2} \leq x_{2} \leq b_{2} \\[2mm] 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(iii)} \quad f\left(x_{1}, x_{2}\right) = \frac{1}{2 \pi \sqrt{1-\rho^{2}}} \exp \left[\frac{-(x_{1}^{2}+x_{2}^{2}+2 \rho x_{1} x_{2})}{2\left(1-\rho^{2}\right)}\right] \quad \text{in which } |\rho|<1. \\[2mm] &\text{(iv)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \frac{1}{36}, & \text{if } x_{1}=1, 2, 3, \ldots,6 \text{ and } x_{2}=1, 2, \ldots,6 \\[2mm] 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(v)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \binom{6}{x_{1}} \binom{6}{x_{2}} \left(\frac{2}{3}\right)^{x_{1}+x_{2}} \left(\frac{1}{3}\right)^{12-x_{1}-x_{2}}, & \text{for } x_{1} \text{ and } x_{2} \text{ equal to nonnegative integers} \\ 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(vi)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \frac{12!}{x_{1}! \, x_{2}! \, \left(12-x_{1}-x_{2}\right)!} \left(\frac{1}{4}\right)^{x_{1}} \left(\frac{1}{3}\right)^{x_{2}} \left(\frac{5}{12}\right)^{12-x_{1}-x_{2}}, & \text{for } x_{1} \text{ and } x_{2} \text{ equal to nonnegative integers such that } x_{1}+x_{2} \leq 12 \\ 0, & \text{otherwise.} \end{cases} \end{align}