More About Expectation

In this section we define the expectation of a function with respect to (i) a probability law specified by its distribution function, and (ii) a numerical $n$-tuple valued random phenomenon.

Stieltjes Integral. In section 2 we defined the expectation of a continuous function $g(x)$ with respect to a probability law specified by a probability mass function or by a probability density function. We now consider the case of a general probability law, which is specified by its distribution function $F(\cdot)$.

In order to define the expectation with respect to a probability law specified by a distribution function $F(\cdot)$, we require a generalization of the notion of integral, which goes under the name of the Stieltjes integral. Given a continuous function $g(x)$, a distribution function $F(\cdot)$, and a half-open interval $(a, b]$ on the real line (that is, $(a, b]$ consists of all the points strictly greater than $a$ and less than or equal to $b$), we define the Stieltjes integral of $g(\cdot)$ with respect to $F(\cdot)$ over $(a, b]$, written $\int_{a+}^{b} g(x)\, dF(x)$, as follows. We start with a partition of the interval $(a, b]$ into $n$ subintervals $(x_{i-1}, x_{i}]$, in which $x_{0}, x_{1}, \ldots, x_{n}$ are $(n+1)$ points chosen so that $a = x_{0} < x_{1} < \cdots < x_{n} = b$. We then choose a set of points $x_{1}^{\prime}, x_{2}^{\prime}, \ldots, x_{n}^{\prime}$, one in each subinterval, so that $x_{i-1} < x_{i}^{\prime} \leq x_{i}$ for $i = 1, 2, \ldots, n$. We define

\int_{a+}^{b} g(x) d F(x)=\underset{n \rightarrow \infty}{\operatorname{limit}} \sum_{i=1}^{n} g\left(x_{i}^{\prime}\right)\left[F\left(x_{i}\right)-F\left(x_{i-1}\right)\right] \tag{6.1} 

in which the limit is taken over all partitions of the interval $(a, b]$, as the maximum length of the subintervals in the partition tends to 0.
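The approximating sums in (6.1) can be tried out numerically. The following is a minimal sketch, not a definitive implementation: the helper name `stieltjes_sum`, the use of equal subintervals with the right endpoint as $x_{i}^{\prime}$, and the exponential distribution function `F_exp` are all illustrative assumptions rather than anything prescribed by the text.

```python
import math

def stieltjes_sum(g, F, a, b, n):
    """Approximating sum from (6.1) on n equal subintervals of (a, b],
    with the right endpoint x_i chosen as the point x_i' in each subinterval."""
    total = 0.0
    x_prev = a
    for i in range(1, n + 1):
        x_i = a + (b - a) * i / n
        total += g(x_i) * (F(x_i) - F(x_prev))
        x_prev = x_i
    return total

def F_exp(x):
    """Distribution function of the exponential law with mean 1 (assumed example)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

# Integral of g(x) = x with respect to F_exp over (0, 50].
# The exact value differs from 1 only by a negligible tail term.
approx = stieltjes_sum(lambda x: x, F_exp, 0.0, 50.0, 200_000)
print(approx)
```

Because $g(x) = x$ is increasing, choosing right endpoints slightly overestimates the integral; the excess is at most the subinterval length, so refining the partition drives the sum toward the limit in (6.1).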

It may be shown that if $F(\cdot)$ is specified by a probability density function $f(\cdot)$, then

\int_{a^{+}}^{b} g(x) d F(x)=\int_{a}^{b} g(x) f(x) d x, \tag{6.2} 

whereas if $F(\cdot)$ is specified by a probability mass function $p(\cdot)$, then

\int_{a+}^{b} g(x) d F(x)=\sum_{\substack{\text { over all } x \text { such that } \\ a<x \leq b \text { and } p(x)>0}} g(x) p(x). \tag{6.3} 

The Stieltjes integral of the continuous function $g(\cdot)$ with respect to the distribution function $F(\cdot)$ over the whole real line is defined by

\int_{-\infty}^{\infty} g(x) d F(x)=\lim _{\substack{a \rightarrow-\infty \\ b \rightarrow \infty}} \int_{a+}^{b} g(x) dF(x). \tag{6.4} 

The discussion in section 2 in regard to the existence and finiteness of integrals over the real line applies also to Stieltjes integrals. We say that $\int_{-\infty}^{\infty} g(x)\, dF(x)$ exists if and only if $\int_{-\infty}^{\infty} |g(x)|\, dF(x)$ is finite. Thus only absolutely convergent Stieltjes integrals are to be invested with sense.

We now define the expectation of a continuous function $g(\cdot)$ with respect to a probability law specified by a distribution function $F(\cdot)$ as the Stieltjes integral of $g(\cdot)$ with respect to $F(\cdot)$ over the infinite real line; in symbols,

E[g(x)]=\int_{-\infty}^{\infty} g(x) dF(x). \tag{6.5} 

Stieltjes integrals are only of theoretical interest. They provide a compact way of defining, and working with, the properties of expectation. In practice, one evaluates a Stieltjes integral by breaking it up into a sum of an ordinary integral and an ordinary summation by means of the following theorem: if there exist a probability density function $f(\cdot)$, a probability mass function $p(\cdot)$, and constants $c_{1}$ and $c_{2}$, whose sum is 1, such that for every $x$

F(x)=c_{1} \int_{-\infty}^{x} f\left(x^{\prime}\right) d x^{\prime}+c_{2} \sum_{\substack{\text { over all } x^{\prime} \leq x \text { such } \\ \text { that } p\left(x^{\prime}\right)>0}} p\left(x^{\prime}\right), \tag{6.6} 

then for any continuous function $g(\cdot)$

\int_{-\infty}^{\infty} g(x) d F(x)=c_{1} \int_{-\infty}^{\infty} g(x) f(x) d x+c_{2} \sum_{\substack{\text { over all } x \text { such } \\ \text { that } p(x)>0}} g(x) p(x). \tag{6.7} 
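As an illustration of how (6.7) is used in practice, the sketch below evaluates its right-hand side for a mixed law. Everything specific here is an assumption made for the example: the helper name `rhs_6_7`, the midpoint rule for the ordinary integral, the truncation of the real line to $[0, 50]$, and the particular law (an exponential density with mean 1 combined with a unit mass at the point 2, with $c_{1} = c_{2} = \frac{1}{2}$).

```python
import math

def rhs_6_7(g, c1, f, c2, pmf, lo, hi, n=100_000):
    """Right-hand side of (6.7): c1 * (integral of g*f) + c2 * (sum of g*p).
    The integral is approximated by the midpoint rule on [lo, hi];
    pmf is a dict mapping each point x with p(x) > 0 to its probability."""
    h = (hi - lo) / n
    integral = h * sum(
        g(lo + (k + 0.5) * h) * f(lo + (k + 0.5) * h) for k in range(n)
    )
    summation = sum(g(x) * p for x, p in pmf.items() if p > 0)
    return c1 * integral + c2 * summation

# Assumed example law: density of the exponential law with mean 1,
# and a probability mass function placing all its mass at x = 2.
f = lambda x: math.exp(-x) if x > 0 else 0.0
pmf = {2.0: 1.0}

# Expectations of g(x) = x and g(x) = x**2 under the mixed law.
mean = rhs_6_7(lambda x: x, 0.5, f, 0.5, pmf, 0.0, 50.0)
second_moment = rhs_6_7(lambda x: x * x, 0.5, f, 0.5, pmf, 0.0, 50.0)
print(mean, second_moment)
```

By hand, the continuous part contributes $\frac{1}{2} \cdot 1$ and the discrete part $\frac{1}{2} \cdot 2$ to the mean, so the computed value should be close to $\frac{3}{2}$; likewise the second moment should be close to $\frac{1}{2} \cdot 2 + \frac{1}{2} \cdot 4 = 3$.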

In giving the proofs of various propositions about probability laws we most often confine ourselves to the case in which the probability law is specified by a probability density function, for here we need employ only ordinary integrals. However, the properties of Stieltjes integrals are very much the same as those of ordinary Riemann integrals; consequently, the proofs we give are immediately translatable into proofs of the general case that require the use of Stieltjes integrals.

Expectations with Respect to Numerical $n$-Tuple Valued Random Phenomena. The foregoing ideas extend immediately to a numerical $n$-tuple valued random phenomenon. Given the distribution function $F(x_{1}, x_{2}, \ldots, x_{n})$ of such a random phenomenon and any continuous function $g(x_{1}, \ldots, x_{n})$ of $n$ real variables, we define the expectation of the function with respect to the random phenomenon by

E\left[g\left(x_{1}, x_{2}, \ldots, x_{n}\right)\right] =\underset{R_n}{\iint \cdots \int} g\left(x_{1}, x_{2}, \ldots, x_{n}\right) d F\left(x_{1}, x_{2}, \ldots, x_{n}\right) \tag{6.8} 

in which the integral is a Stieltjes integral over the space $R_{n}$ of all $n$-tuples $(x_{1}, x_{2}, \ldots, x_{n})$ of real numbers. We shall not write out here the definition of this integral.

We note that (6.2) and (6.3) generalize. If the distribution function $F(x_{1}, x_{2}, \ldots, x_{n})$ is specified by a probability density function $f(x_{1}, x_{2}, \ldots, x_{n})$, so that $(7.7^{\prime})$ of Chapter 4 holds, then

E[g(x_{1}, x_{2}, \ldots, x_{n})] = \underbrace{\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}}_{\text{$n$ integrals}} g(x_{1}, x_{2}, \ldots, x_{n}) f(x_{1}, x_{2}, \ldots, x_{n}) \, dx_{1} \, dx_{2} \cdots \, dx_{n} \tag{6.9} 
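For $n = 2$, formula (6.9) can be approximated by a double midpoint rule. The sketch below is only an illustration under stated assumptions: the helper name `expectation_density_2d` is hypothetical, and the density $f(x_{1}, x_{2}) = x_{1} + x_{2}$ on the unit square (zero elsewhere) is an example chosen here, not one taken from the text.

```python
def expectation_density_2d(g, f, a1, b1, a2, b2, n=200):
    """Approximate the double integral of g*f in (6.9) over the rectangle
    [a1, b1] x [a2, b2] by the midpoint rule on an n-by-n grid.
    The density f is assumed to vanish outside the rectangle."""
    h1 = (b1 - a1) / n
    h2 = (b2 - a2) / n
    total = 0.0
    for i in range(n):
        x1 = a1 + (i + 0.5) * h1
        for j in range(n):
            x2 = a2 + (j + 0.5) * h2
            total += g(x1, x2) * f(x1, x2)
    return total * h1 * h2

# Assumed example density: f(x1, x2) = x1 + x2 on the unit square.
f = lambda x1, x2: x1 + x2

# Expectation of g(x1, x2) = x1 * x2; direct integration gives 1/3.
value = expectation_density_2d(lambda x1, x2: x1 * x2, f, 0.0, 1.0, 0.0, 1.0)
print(value)
```

The same helper applied to $g \equiv 1$ should return a value close to 1, which is a quick check that the chosen $f$ is in fact a probability density function.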

If the distribution function $F(x_{1}, x_{2}, \ldots, x_{n})$ is specified by a probability mass function $p(x_{1}, x_{2}, \ldots, x_{n})$, so that $(7.8^{\prime})$ of Chapter 4 holds, then

\begin{align} & E\left[g\left(x_{1}, x_{2}, \ldots, x_{n}\right)\right] \tag{6.10} \\ & =\sum_{\substack{\text { over all }\left(x_{1}, x_{2}, \ldots, x_{n}\right) \\ \text { such that } p\left(x_{1}, x_{2}, \ldots, x_{n}\right)>0}} g\left(x_{1}, x_{2}, \ldots, x_{n}\right) p\left(x_{1}, x_{2}, \ldots, x_{n}\right) . \end{align}
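The sum in (6.10) translates directly into code when the probability mass function is stored as a table. The following is a minimal sketch; the helper name `expectation_pmf` and the four-point probability mass function on pairs are hypothetical choices made for illustration. Exact rational arithmetic is used so that the result can be compared with a hand computation.

```python
from fractions import Fraction

def expectation_pmf(g, pmf):
    """Sum g(x_1, ..., x_n) * p(x_1, ..., x_n) over all points of positive
    probability, as in (6.10). pmf maps n-tuples to their probabilities."""
    return sum(g(*point) * prob for point, prob in pmf.items() if prob > 0)

# Assumed example: a probability mass function on four pairs (x1, x2).
pmf = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}
assert sum(pmf.values()) == 1  # the probabilities must add up to 1

# Expectation of g(x1, x2) = x1 * x2: only the point (1, 1) contributes.
e_product = expectation_pmf(lambda x1, x2: x1 * x2, pmf)
print(e_product)
```

Here only the point $(1, 1)$ gives a nonzero product, so the expectation equals its probability, $\frac{2}{5}$.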

Exercises

6.1. Compute the mean, variance, and moment-generating function of each of the probability laws specified by the following distribution functions. (Recall that $[x]$ denotes the largest integer less than or equal to $x$.)

\begin{align} &\text{(i)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 0 \\[2mm] 1 - \frac{1}{3} e^{-(x / 3)} - \frac{2}{3} e^{- [x / 3]}, & \text{ for } x \geq 0. \end{cases} \\[2mm] &\text{(ii)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 0 \\[2mm] 8 \displaystyle\int_{0}^{x} y e^{-4y} \, dy + \frac{e^{-2}}{2} \sum_{k=0}^{[x]} \frac{2^{k}}{k!}, & \text{ for } x \geq 0. \end{cases} \\[2mm] & \text{(iii)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 1 \\[2mm] 1 - \frac{1}{2x^{2}} - \frac{1}{2^{[x]}}, & \text{ for } x \geq 1. \end{cases}\\[2mm] & \text{(iv)} \quad F(x) = \begin{cases} 0, & \text{ for } x < 1 \\[2mm] 1 - \frac{2}{3x} - \frac{1}{3^{[x]}}, & \text{ for } x \geq 1. \end{cases} \end{align} 

 

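Answer

(i) m.g.f., $\frac{1}{3} \frac{1}{1-3t} + \frac{2}{3} \frac{1-e^{-1}}{1-e^{3t-1}} e^{3t}$; (ii) m.g.f., $\frac{1}{2}\left(1-\frac{t}{4}\right)^{-2} + \frac{1}{2} e^{2(e^{t}-1)}$; (iii) mean $\frac{5}{2}$, variance $\infty$, m.g.f. does not exist; (iv) mean $\infty$; variance and m.g.f. do not exist.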

6.2. Compute the expectation of the function $g(x_{1}, x_{2}) = x_{1} x_{2}$ with respect to the probability laws of the numerical 2-tuple valued random phenomenon specified by the following probability density functions or probability mass functions:

\begin{align} &\text{(i)} \quad f\left(x_{1}, x_{2}\right) = \exp \left(-2\left|x_{1}\right|-2\left|x_{2}\right|\right) \\[2mm] &\text{(ii)} \quad f\left(x_{1}, x_{2}\right) = \begin{cases} \frac{1}{\left(b_{1}-a_{1}\right)\left(b_{2}-a_{2}\right)}, & \text{if } a_{1} \leq x_{1} \leq b_{1} \text{ and } a_{2} \leq x_{2} \leq b_{2} \\[2mm] 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(iii)} \quad f\left(x_{1}, x_{2}\right) = \frac{1}{2 \pi \sqrt{1-\rho^{2}}} \exp \left[\frac{-(x_{1}^{2}+x_{2}^{2}+2 \rho x_{1} x_{2})}{2\left(1-\rho^{2}\right)}\right] \quad \text{in which } |\rho|<1. \\[2mm] &\text{(iv)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \frac{1}{36}, & \text{if } x_{1}=1, 2, \ldots, 6 \text{ and } x_{2}=1, 2, \ldots, 6 \\[2mm] 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(v)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \binom{6}{x_{1}} \binom{6}{x_{2}} \left(\frac{2}{3}\right)^{x_{1}+x_{2}} \left(\frac{1}{3}\right)^{12-x_{1}-x_{2}}, & \text{for } x_{1} \text{ and } x_{2} \text{ equal to nonnegative integers} \\[2mm] 0, & \text{otherwise.} \end{cases} \\[2mm] &\text{(vi)} \quad p\left(x_{1}, x_{2}\right) = \begin{cases} \binom{12}{x_{1} \;\; x_{2} \;\; 12-x_{1}-x_{2}} \left(\frac{1}{4}\right)^{x_{1}} \left(\frac{1}{3}\right)^{x_{2}} \left(\frac{5}{12}\right)^{12-x_{1}-x_{2}}, & \text{for } x_{1} \text{ and } x_{2} \text{ equal to nonnegative integers} \\[2mm] 0, & \text{otherwise.} \end{cases} \end{align}