Expectations of Jointly Distributed Random Variables

Consider two jointly distributed random variables X_1 and X_2. The expectation E_{X_1, X_2}[g(x_1, x_2)] of a function g(x_1, x_2) of two real variables is defined as follows:

If the random variables X_1 and X_2 are jointly continuous, with joint probability density function f_{X_1, X_2}(x_1, x_2), then

E_{X_{1}, X_{2}}\left[g\left(x_{1}, x_{2}\right)\right]=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g\left(x_{1}, x_{2}\right) f_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right) d x_{1} d x_{2}. \tag{2.1} 

If the random variables X_1 and X_2 are jointly discrete, with joint probability mass function p_{X_1, X_2}(x_1, x_2), then

E_{X_{1}, X_{2}}\left[g\left(x_{1}, x_{2}\right)\right]= \displaystyle\sum_{\substack{\text { over all }\left(x_{1}, x_{2}\right) \text { such } \\ \text { that } p_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right)>0}} g\left(x_{1}, x_{2}\right) p_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right). \tag{2.2} 

If the random variables X_1 and X_2 have joint distribution function F_{X_1, X_2}(x_1, x_2), then

E_{X_{1}, X_{2}}\left[g\left(x_{1}, x_{2}\right)\right]=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g\left(x_{1}, x_{2}\right) d F_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right), \tag{2.3} 

where the two-dimensional Stieltjes integral may be defined in a manner similar to that in which the one-dimensional Stieltjes integral was defined in section 6 of Chapter 5.

On the other hand, g(X_1, X_2) is a random variable, with expectation

\begin{align} E[g(X_{1}, X_{2})] =\begin{cases} \displaystyle\int_{-\infty}^{\infty} y \, dF_{g(X_{1}, X_{2})}(y), \\[4mm] \displaystyle\int_{-\infty}^{\infty} y f_{g(X_{1}, X_{2})}(y) \, dy, \\[4mm] \displaystyle\sum_{\substack{\text{over all points} \; y \\ \text{where} \; p_{g(X_{1}, X_{2})}(y) > 0}} y \, p_{g(X_{1}, X_{2})}(y), \end{cases} \tag{2.4} \end{align} 

depending on whether the probability law of g(X_1, X_2) is specified by its distribution function, probability density function, or probability mass function.

It is a basic fact of probability theory that for any jointly distributed random variables X_1 and X_2 and any Borel function g(x_1, x_2)

E\left[g\left(X_{1}, X_{2}\right)\right]=E_{X_{1}, X_{2}}\left[g\left(x_{1}, x_{2}\right)\right], \tag{2.5} 

in the sense that if either of the expectations in (2.5) exists then so does the other, and the two are equal. A rigorous proof of (2.5) is beyond the scope of this book.

In view of (2.5) we have two ways of computing the expectation of a function of jointly distributed random variables. Equation (2.5) generalizes (1.5). Similarly, (1.11) may also be generalized.
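The two routes to the same expectation can be checked on a small discrete case. The sketch below assumes a hypothetical joint law (two independent fair dice) and an illustrative function g(x_1, x_2) = max(x_1, x_2); neither comes from the text. It computes the expectation once from the joint probability mass function, as in (2.2), and once from the probability mass function of g(X_1, X_2), as in the last line of (2.4).

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical joint pmf: two independent fair dice (an illustrative choice).
joint_pmf = {(x1, x2): Fraction(1, 36) for x1, x2 in product(range(1, 7), repeat=2)}


def g(x1, x2):
    """Illustrative Borel function: the larger of the two faces."""
    return max(x1, x2)


# Route 1, as in (2.2): sum g(x1, x2) * p(x1, x2) over the joint pmf.
e_joint = sum(g(x1, x2) * p for (x1, x2), p in joint_pmf.items())

# Route 2, as in (2.4): build the pmf of Y = g(X1, X2), then sum y * p_Y(y).
pmf_y = defaultdict(Fraction)
for (x1, x2), p in joint_pmf.items():
    pmf_y[g(x1, x2)] += p
e_y = sum(y * p for y, p in pmf_y.items())

print(e_joint, e_y)  # 161/36 161/36
```

Exact rational arithmetic is used so that the equality asserted by (2.5) holds exactly rather than up to rounding.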

Let X_1, X_2, and Y be random variables such that Y = g_1(X_1, X_2) for some Borel function g_1(x_1, x_2). Then for any Borel function g(·)

E[g(Y)]=E\left[g\left(g_{1}\left(X_{1}, X_{2}\right)\right)\right]. \tag{2.6} 

The most important property possessed by the operation of expectation of a random variable is its linearity property: if X_1 and X_2 are jointly distributed random variables with finite expectations E[X_1] and E[X_2], then the sum X_1 + X_2 has a finite expectation given by

E\left[X_{1}+X_{2}\right]=E\left[X_{1}\right]+E\left[X_{2}\right]. \tag{2.7} 

Let us sketch a proof of (2.7) in the case that X_1 and X_2 are jointly continuous. The reader may gain some idea of how (2.7) is proved in general by consulting the proof of (6.22) in Chapter 2.

From (2.5) it follows that

E[X_{1} + X_{2}] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x_{1} + x_{2}) f_{X_{1}, X_{2}}(x_{1}, x_{2}) \, dx_{1} \, dx_{2}. \tag{$2.7^{\prime}$} 

Now

\begin{align} & \int_{-\infty}^{\infty} d x_{1}\; x_{1} \int_{-\infty}^{\infty} d x_{2}\; f_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right)=\int_{-\infty}^{\infty} dx_{1}\; x_{1} f_{X_{1}}\left(x_{1}\right)=E\left[X_{1}\right], \\ & \int_{-\infty}^{\infty} d x_{2}\; x_{2} \int_{-\infty}^{\infty} d x_{1}\; f_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right)=\int_{-\infty}^{\infty} d x_{2}\;x_{2} f_{X_{2}}\left(x_{2}\right)=E\left[X_{2}\right]. \tag{$2.7^{\prime\prime}$} \end{align} 

The integral on the right-hand side of \left(2.7^{\prime}\right) is equal to the sum of the integrals on the left-hand sides of \left(2.7^{\prime\prime}\right) . The proof of (2.7) is now complete.
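The same argument can be traced numerically in the discrete case, where sums replace integrals. The following is a minimal sketch, assuming a hypothetical joint probability mass function for two dependent random variables (the table of probabilities is invented for illustration). It forms the marginal laws by summing out the other variable, which is the discrete analogue of the inner integrals in the proof, and compares E[X_1] + E[X_2] with E[X_1 + X_2] computed from the joint law.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of dependent variables: X1 in {0, 1, 2}, X2 in {0, 1}.
joint_pmf = {(0, 0): F(1, 4), (1, 0): F(1, 8), (2, 0): F(1, 8),
             (0, 1): F(1, 8), (1, 1): F(1, 8), (2, 1): F(1, 4)}

# Marginal pmfs: sum the joint pmf over the other variable.
pmf_x1, pmf_x2 = {}, {}
for (x1, x2), p in joint_pmf.items():
    pmf_x1[x1] = pmf_x1.get(x1, 0) + p
    pmf_x2[x2] = pmf_x2.get(x2, 0) + p

e_x1 = sum(x * p for x, p in pmf_x1.items())
e_x2 = sum(x * p for x, p in pmf_x2.items())

# Left-hand side of (2.7), computed directly from the joint pmf.
e_sum = sum((x1 + x2) * p for (x1, x2), p in joint_pmf.items())

print(e_x1, e_x2, e_sum)  # 1 1/2 3/2
```

Note that the variables here are not independent (for instance, P[X_1 = 0, X_2 = 0] = 1/4 while P[X_1 = 0] P[X_2 = 0] = 3/16), yet linearity still holds.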

The moments and moment-generating function of jointly distributed random variables are defined by a direct generalization of the definitions given for a single random variable. For any two nonnegative integers n_1 and n_2 we define

\alpha_{n_{1}, n_{2}}=E\left[X_{1}^{n_{1}} X_{2}^{n_{2}}\right] \tag{2.8} 

as a moment of the jointly distributed random variables X_1 and X_2. The sum n_1 + n_2 is called the order of the moment. The moments of orders 1 and 2 have the following names: α_{1,0} and α_{0,1} are, respectively, the means of X_1 and X_2, whereas α_{2,0} and α_{0,2} are, respectively, the mean squares of X_1 and X_2. The moment α_{1,1} = E[X_1 X_2] is called the product moment.

We next define the central moments of the random variables X_1 and X_2. For any two nonnegative integers n_1 and n_2, we define

\mu_{n_{1}, n_{2}}=E\left[\left(X_{1}-E\left[X_{1}\right]\right)^{n_{1}}\left(X_{2}-E\left[X_{2}\right]\right)^{n_{2}}\right] \tag{$2.8^{\prime}$} 

as a central moment of order n_1 + n_2. We are again particularly interested in the central moments of orders 1 and 2. The central moments μ_{1,0} and μ_{0,1} of order 1 both vanish, whereas μ_{2,0} and μ_{0,2} are, respectively, the variances of X_1 and X_2. The central moment μ_{1,1} is called the covariance of the random variables X_1 and X_2 and is written Cov[X_1, X_2]; in symbols,

\operatorname{Cov}\left[X_{1}, X_{2}\right]=\mu_{1,1}=E\left[\left(X_{1}-E\left[X_{1}\right]\right)\left(X_{2}-E\left[X_{2}\right]\right)\right]. \tag{2.9} 

We leave it to the reader to prove that the covariance is equal to the product moment minus the product of the means; in symbols,

\operatorname{Cov}\left[X_{1}, X_{2}\right]=E\left[X_{1} X_{2}\right]-E\left[X_{1}\right] E\left[X_{2}\right]. \tag{2.10} 
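For readers who wish to check their work, one possible route to (2.10) is sketched here. It uses only the linearity property (2.7) and the shorthand m_1 = E[X_1], m_2 = E[X_2]; it assumes that the means and the product moment are finite.

```latex
\begin{align}
\operatorname{Cov}[X_1, X_2]
  &= E\left[(X_1 - m_1)(X_2 - m_2)\right] \\
  &= E[X_1 X_2] - m_2 E[X_1] - m_1 E[X_2] + m_1 m_2 \\
  &= E[X_1 X_2] - m_1 m_2 .
\end{align}
```

The second line expands the product and applies linearity term by term; the third uses E[X_1] = m_1 and E[X_2] = m_2, so that the last three terms collapse to -m_1 m_2.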

The covariance derives its importance from the role it plays in the basic formula for the variance of the sum of two random variables : \operatorname{Var}\left[X_{1}+X_{2}\right]=\operatorname{Var}\left[X_{1}\right]+\operatorname{Var}\left[X_{2}\right]+2 \operatorname{Cov}\left[X_{1}, X_{2}\right] \tag{2.11} 

To prove (2.11), we write \begin{align} \operatorname{Var}\left[X_{1}+X_{2}\right]= & E\left[\left(X_{1}+X_{2}\right)^{2}\right]-E^{2}\left[X_{1}+X_{2}\right] \\[2mm] = & E\left[X_{1}^{2}\right]-E^{2}\left[X_{1}\right]+E\left[X_{2}^{2}\right]-E^{2}\left[X_{2}\right] \\[2mm] & \quad\quad\quad +2\left(E\left[X_{1} X_{2}\right]-E\left[X_{1}\right] E\left[X_{2}\right]\right), \end{align} from which (2.11) follows by (1.8) and (2.10).
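Formulas (2.9), (2.10), and (2.11) can be checked against one another on a small example. The sketch below assumes a hypothetical joint probability mass function (the numbers are illustrative, not from the text) and a helper named `expect`, introduced here only for convenience.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of dependent variables: X1 in {0, 1, 2}, X2 in {0, 1}.
joint_pmf = {(0, 0): F(1, 4), (1, 0): F(1, 8), (2, 0): F(1, 8),
             (0, 1): F(1, 8), (1, 1): F(1, 8), (2, 1): F(1, 4)}


def expect(h):
    """Expectation of h(X1, X2) computed from the joint pmf, as in (2.2)."""
    return sum(h(x1, x2) * p for (x1, x2), p in joint_pmf.items())


m1 = expect(lambda a, b: a)
m2 = expect(lambda a, b: b)

cov_def = expect(lambda a, b: (a - m1) * (b - m2))   # definition (2.9)
cov_short = expect(lambda a, b: a * b) - m1 * m2     # shortcut (2.10)

var1 = expect(lambda a, b: (a - m1) ** 2)
var2 = expect(lambda a, b: (b - m2) ** 2)
var_sum = expect(lambda a, b: (a + b - m1 - m2) ** 2)  # Var[X1 + X2] directly

print(cov_def, cov_short, var_sum, var1 + var2 + 2 * cov_def)  # 1/8 1/8 5/4 5/4
```

Because the covariance here is positive, the variance of the sum (5/4) exceeds the sum of the variances (3/4 + 1/4 = 1).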

The joint moment-generating function is defined for any two real numbers, t_1 and t_2, by

\psi_{X_{1}, X_{2}}\left(t_{1}, t_{2}\right)=E\left[e^{\left(t_{1} X_{1}+t_{2} X_{2}\right)}\right].

The moments can be read off from the power-series expansion of the moment-generating function, since formally \psi_{X_{1}, X_{2}}\left(t_{1}, t_{2}\right)=\sum_{n_{1}=0}^{\infty} \sum_{n_{2}=0}^{\infty} \frac{t_{1}^{n_{1}}}{n_{1} !} \frac{t_{2}^{n_{2}}}{n_{2} !} E\left[X_{1}^{n_{1}} X_{2}^{n_{2}}\right]. \tag{2.12} 

In particular, the means, variances, and covariance of X 1 and X 2 may be expressed in terms of the derivatives of the moment-generating function: \begin{align} E[X_1] &= \frac{\partial}{\partial t_1} \psi_{X_1, X_2}(0, 0), & E[X_2] &= \frac{\partial}{\partial t_2} \psi_{X_1, X_2}(0, 0), \tag{2.13} \\[4mm] E[X_1^2] &= \frac{\partial^2}{\partial t_1^2} \psi_{X_1, X_2}(0, 0), & E[X_2^2] &= \frac{\partial^2}{\partial t_2^2} \psi_{X_1, X_2}(0, 0). \tag{2.14} \\[4mm] E[X_1 X_2] &= \frac{\partial^2}{\partial t_1 \partial t_2} \psi_{X_1, X_2}(0, 0). \tag{2.15} \\[4mm] \text{Var}[X_1] &= \frac{\partial^2}{\partial t_1^2} \psi_{X_1 - m_1, X_2 - m_2}(0, 0), & \text{Var}[X_2] &= \frac{\partial^2}{\partial t_2^2} \psi_{X_1 - m_1, X_2 - m_2}(0, 0). \tag{2.16} \\[4mm] \text{Cov}[X_1, X_2] &= \frac{\partial^2}{\partial t_1 \partial t_2} \psi_{X_1 - m_1, X_2 - m_2}(0, 0). \tag{2.17} \end{align} in which m 1 = E [ X 1 ] , m 2 = E [ X 2 ] .
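The derivative formulas (2.13) to (2.15) can be illustrated numerically by replacing each derivative with a central finite difference. This is only a sketch: the joint probability mass function, the step size `h`, and the function name `psi` are assumptions made for the illustration, and finite differences give approximations rather than exact moments.

```python
import math

# Hypothetical joint pmf (illustrative values only).
joint_pmf = {(0, 0): 0.25, (1, 0): 0.125, (2, 0): 0.125,
             (0, 1): 0.125, (1, 1): 0.125, (2, 1): 0.25}


def psi(t1, t2):
    """Joint moment-generating function E[exp(t1*X1 + t2*X2)]."""
    return sum(p * math.exp(t1 * x1 + t2 * x2) for (x1, x2), p in joint_pmf.items())


h = 1e-4  # finite-difference step (an arbitrary small value)

# Central differences approximating the derivatives at (0, 0).
mean_x1 = (psi(h, 0) - psi(-h, 0)) / (2 * h)                      # (2.13)
mean_x2 = (psi(0, h) - psi(0, -h)) / (2 * h)                      # (2.13)
mean_sq_x1 = (psi(h, 0) - 2 * psi(0, 0) + psi(-h, 0)) / h**2      # (2.14)
product_moment = (psi(h, h) - psi(h, -h) - psi(-h, h) + psi(-h, -h)) / (4 * h**2)  # (2.15)

cov = product_moment - mean_x1 * mean_x2  # by (2.10)
print(mean_x1, mean_x2, mean_sq_x1, product_moment, cov)
```

For this pmf the exact values are E[X_1] = 1, E[X_2] = 0.5, E[X_1^2] = 1.75, E[X_1 X_2] = 0.625, and Cov[X_1, X_2] = 0.125; the printed numbers should agree with these to several decimal places.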


Example 2A. The joint moment-generating function and covariance of jointly normal random variables. Let X_1 and X_2 be jointly normally distributed random variables with a joint probability density function

\begin{align} f_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right)= & \frac{1}{2 \pi \sigma_{1} \sigma_{2} \sqrt{1-\rho^{2}}} \exp \left\{-\frac{1}{2\left(1-\rho^{2}\right)}\left[\left(\frac{x_{1}-m_{1}}{\sigma_{1}}\right)^{2}\right.\right. \tag{2.18} \\ & \left.\left.-2 \rho\left(\frac{x_{1}-m_{1}}{\sigma_{1}}\right)\left(\frac{x_{2}-m_{2}}{\sigma_{2}}\right)+\left(\frac{x_{2}-m_{2}}{\sigma_{2}}\right)^{2}\right]\right\}. \end{align}

The joint moment-generating function is given by

\psi_{X_{1}, X_{2}}\left(t_{1}, t_{2}\right)=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{\left(t_{1} x_{1}+t_{2} x_{2}\right)} f_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right) d x_{1} d x_{2}. \tag{2.19}

 

To evaluate the integral in (2.19), let us note that since

u_{1}^{2}-2 \rho u_{1} u_{2}+u_{2}^{2}=\left(1-\rho^{2}\right) u_{1}^{2}+\left(u_{2}-\rho u_{1}\right)^{2},

we may write \begin{align} f_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right)& =\frac{1}{\sigma_{1}} \phi\left(\frac{x_{1}-m_{1}}{\sigma_{1}}\right) \frac{1}{\sigma_{2} \sqrt{1-\rho^{2}}} \tag{2.20} \\ & \times \phi\left(\frac{x_{2}-m_{2}-\left(\sigma_{2} / \sigma_{1}\right) \rho\left(x_{1}-m_{1}\right)}{\sigma_{2} \sqrt{1-\rho^{2}}}\right), \end{align}

in which ϕ(u) = (2π)^{-1/2} e^{-u^2/2} is the normal density function. Using our knowledge of the moment-generating function of a normal law, we may perform the integration with respect to the variable x_2 in the integral in (2.19). We thus determine that ψ_{X_1, X_2}(t_1, t_2) is equal to

\begin{align} \int_{-\infty}^{\infty} dx_{1} \frac{1}{\sigma_{1}} \phi\left(\frac{x_{1}-m_{1}}{\sigma_{1}}\right) & \exp \left(t_{1} x_{1}\right) \exp \left\{t_{2}\left[m_{2}+\frac{\sigma_{2}}{\sigma_{1}} \rho\left(x_{1}-m_{1}\right)\right]\right\} \tag{2.21} \\ & \times \exp \left[\frac{1}{2} t_{2}^{2} \sigma_{2}^{2}\left(1-\rho^{2}\right)\right] \\ &=\exp \left[\frac{1}{2} t_{2}^{2} \sigma_{2}^{2}\left(1-\rho^{2}\right)+t_{2} m_{2}-t_{2} \frac{\sigma_{2}}{\sigma_{1}} \rho m_{1}\right] \\ & \times \exp \left[m_{1}\left(t_{1}+t_{2} \frac{\sigma_{2}}{\sigma_{1}} \rho\right)+\frac{1}{2} \sigma_{1}^{2}\left(t_{1}+t_{2} \frac{\sigma_{2}}{\sigma_{1}} \rho\right)^{2}\right]. \end{align}

By combining terms in (2.21), we finally obtain that

\psi_{X_{1}, X_{2}}\left(t_{1}, t_{2}\right)=\exp \left[t_{1} m_{1}+t_{2} m_{2}+\frac{1}{2}\left(t_{1}^{2} \sigma_{1}^{2}+2 \rho \sigma_{1} \sigma_{2} t_{1} t_{2}+t_{2}^{2} \sigma_{2}^{2}\right)\right]. \tag{2.22}

The covariance is given by

\operatorname{Cov}\left[X_{1}, X_{2}\right]=\left.\frac{\partial^{2}}{\partial t_{1} \partial t_{2}} e^{-\left(t_{1} m_{1}+t_{2} m_{2}\right)} \psi_{X_{1}, X_{2}}\left(t_{1}, t_{2}\right)\right|_{t_{1}=0, t_{2}=0}=\rho \sigma_{1} \sigma_{2}. \tag{2.23}

 

Thus, if two random variables are jointly normally distributed, their joint probability law is completely determined from a knowledge of their first and second moments, since m_1 = E[X_1], m_2 = E[X_2], σ_1^2 = Var[X_1], σ_2^2 = Var[X_2], ρσ_1σ_2 = Cov[X_1, X_2].
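The factorization (2.20) also suggests a way to simulate a jointly normal pair: draw X_1 from its normal marginal law, then draw X_2 from the normal law with mean m_2 + (σ_2/σ_1)ρ(x_1 - m_1) and standard deviation σ_2 (1 - ρ^2)^{1/2}. The sketch below uses this idea to check (2.23) by Monte Carlo. The parameter values, the sample size, and the seed are arbitrary choices made for illustration, and the sample covariance only approximates ρσ_1σ_2.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

m1, m2 = 1.0, -2.0            # illustrative means
s1, s2, rho = 2.0, 3.0, 0.5   # illustrative standard deviations and correlation
n = 100_000                   # illustrative sample size

xs1, xs2 = [], []
for _ in range(n):
    x1 = random.gauss(m1, s1)
    # Second factor of (2.20): given X1 = x1, X2 is normal with these parameters.
    cond_mean = m2 + (s2 / s1) * rho * (x1 - m1)
    cond_sd = s2 * math.sqrt(1 - rho**2)
    x2 = random.gauss(cond_mean, cond_sd)
    xs1.append(x1)
    xs2.append(x2)

mean1 = sum(xs1) / n
mean2 = sum(xs2) / n
sample_cov = sum((a - mean1) * (b - mean2) for a, b in zip(xs1, xs2)) / n

print(sample_cov, rho * s1 * s2)  # sample value should be close to 3.0
```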

The foregoing notions may be extended to the case of n jointly distributed random variables, X_1, X_2, …, X_n. For any Borel function g(x_1, x_2, …, x_n) of n real variables, the expectation E[g(X_1, X_2, …, X_n)] of the random variable g(X_1, X_2, …, X_n) may be expressed in terms of the joint probability law of X_1, X_2, …, X_n.

If X_1, X_2, …, X_n are jointly continuous, with a joint probability density function f_{X_1, X_2, …, X_n}(x_1, x_2, …, x_n), it may be shown that

\begin{align} E\left[g\left(X_{1}, X_{2}, \ldots, X_{n}\right)\right] & =\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g\left(x_{1}, x_{2}, \ldots, x_{n}\right) \tag{2.24} \\[3mm] & \times f_{X_{1}, X_{2}, \ldots, X_{n}}\left(x_{1}, x_{2}, \ldots, x_{n}\right) d x_{1} d x_{2} \cdots d x_{n}. \end{align} 

If X_1, X_2, …, X_n are jointly discrete, with a joint probability mass function p_{X_1, X_2, …, X_n}(x_1, x_2, …, x_n), it may be shown that

\begin{align} E\left[g\left(X_1, X_2, \ldots, X_n\right)\right] & = \sum_{\substack{\text { over all }\left(x_1, x_2, \ldots, x_n\right) \text { such that } \\ p_{X_1, X_2, \ldots, X_n}\left(x_1, x_2, \ldots, x_n\right)>0}} g\left(x_1, x_2, \ldots, x_n\right) p_{X_1, X_2, \ldots, X_n}\left(x_1, x_2, \ldots, x_n\right). \tag{2.25} \end{align}

The joint moment-generating function of n jointly distributed random variables is defined by

\psi_{X_{1}, X_{2}, \ldots, X_{n}}\left(t_{1}, t_{2}, \ldots, t_{n}\right)=E\left[e^{\left(t_{1} X_{1}+t_{2} X_{2}+\cdots+t_{n} X_{n}\right)}\right]. \tag{2.26} 

It may also be proved that if X_1, X_2, …, X_n and Y are random variables such that Y = g_1(X_1, X_2, …, X_n) for some Borel function g_1(x_1, x_2, …, x_n) of n real variables, then for any Borel function g(·) of one real variable

E[g(Y)]=E\left[g\left(g_{1}\left(X_{1}, X_{2}, \ldots, X_{n}\right)\right)\right]. \tag{2.27} 
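A small discrete case shows how (2.25) and (2.27) work together for n > 2. The sketch assumes a hypothetical joint law (n = 3 independent fair dice) and the illustrative choices g_1(x_1, x_2, x_3) = x_1 + x_2 + x_3 and g(y) = y^2; none of these come from the text.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

n = 3
# Hypothetical joint pmf of n independent fair dice.
joint_pmf = {xs: Fraction(1, 6**n) for xs in product(range(1, 7), repeat=n)}


def g1(*xs):
    """Illustrative function of n real variables: the total of the faces."""
    return sum(xs)


def g(y):
    """Illustrative function of one real variable."""
    return y * y


# Right-hand side of (2.27), evaluated with the n-fold sum (2.25).
rhs = sum(g(g1(*xs)) * p for xs, p in joint_pmf.items())

# Left-hand side of (2.27): first find the pmf of Y = g1(X1, ..., Xn).
pmf_y = defaultdict(Fraction)
for xs, p in joint_pmf.items():
    pmf_y[g1(*xs)] += p
lhs = sum(g(y) * p for y, p in pmf_y.items())

print(lhs, rhs)  # 119 119
```

The value 119 agrees with Var[Y] + E^2[Y] = 3(35/12) + (21/2)^2 for the total of three fair dice.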

Theoretical Exercises

Exercise 1.

2.1 . Linearity property of the expectation operation . Let X 1 and X 2 be jointly discrete random variables with finite means. Show that (2.7) holds.

Exercise 2.

2.2. Let X_1 and X_2 be jointly distributed random variables whose joint moment-generating function has a logarithm given by

\log \psi_{X_{1}, X_{2}}\left(t_{1}, t_{2}\right)=\nu \int_{-\infty}^{\infty} d u \int_{-\infty}^{\infty} d y f_{Y}(y)\left\{e^{y\left[t_{1} W_{1}(u)+t_{2} W_{2}(u)\right]}-1\right\}, \tag{2.28}

in which Y is a random variable with probability density function f_Y(·), W_1(·) and W_2(·) are known functions, and ν > 0. Show that

E\left[X_{1}\right]=\nu E[Y] \int_{-\infty}^{\infty} W_{1}(u) d u, \quad E\left[X_{2}\right]=\nu E[Y] \int_{-\infty}^{\infty} W_{2}(u) d u,

\begin{align} \operatorname{Var}\left[X_{1}\right] & =\nu E\left[Y^{2}\right] \int_{-\infty}^{\infty} W_{1}^{2}(u) d u, \tag{2.29} \\ \operatorname{Var}\left[X_{2}\right] & =\nu E\left[Y^{2}\right] \int_{-\infty}^{\infty} W_{2}^{2}(u) d u, \\ \operatorname{Cov}\left[X_{1}, X_{2}\right] & =\nu E\left[Y^{2}\right] \int_{-\infty}^{\infty} W_{1}(u) W_{2}(u) d u. \end{align}

Moment-generating functions of the form of (2.28) play an important role in the mathematical theory of the phenomenon of shot noise in radio tubes.

Exercise 3.

2.3. The random telegraph signal. For t > 0 let X(t) = U(-1)^{N(t)}, where U is a discrete random variable such that P[U = 1] = P[U = -1] = 1/2, {N(t), t > 0} is a family of random variables such that N(0) = 0, and for any times t_1 < t_2, the random variables U, N(t_1), and N(t_2) - N(t_1) are independent. For any t_1 < t_2, suppose that N(t_2) - N(t_1) obeys either (i) a Poisson probability law with parameter λ = ν(t_2 - t_1), or (ii) a binomial probability law with parameters p and n = t_2 - t_1. Show that E[X(t)] = 0 for any t > 0, and for any t ≥ 0, τ ≥ 0

\begin{align} E[X(t) X(t+\tau)] & = \begin{cases} e^{-2 \nu \tau}, & \text{Poisson case} \\ (q-p)^{\tau}, & \text{binomial case}. \end{cases} \tag{2.30} \end{align}

Regarded as a random function of time, X(t) is called a “random telegraph signal”. Note: in the binomial case, q = 1 - p and t takes only integer values.
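As a numerical sanity check on (2.30), and not a substitute for the proof requested, note that since U^2 = 1 and (-1)^{2N(t)} = 1, the product X(t)X(t+τ) equals (-1)^K with K = N(t+τ) - N(t). The sketch below evaluates E[(-1)^K] directly from the Poisson and binomial probability mass functions. The parameter values and the truncation of the Poisson series at 60 terms are arbitrary illustrative choices.

```python
import math

nu, tau = 0.7, 2.0   # illustrative Poisson intensity and lag
lam = nu * tau

# E[(-1)^K] for K Poisson with parameter lam; each pmf term is built from the
# previous one to avoid computing large factorials.
term, poisson_sum = math.exp(-lam), 0.0
for k in range(60):
    poisson_sum += (-1) ** k * term
    term *= lam / (k + 1)

p, n = 0.3, 5        # illustrative binomial parameters; n plays the role of tau
q = 1 - p
binomial_sum = sum(
    (-1) ** k * math.comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)
)

print(poisson_sum, math.exp(-2 * nu * tau))
print(binomial_sum, (q - p) ** n)
```

Each printed pair should agree up to floating-point rounding.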

Exercises

Exercise 4.

2.1. An ordered sample of size 5 is drawn without replacement from an urn containing 8 white balls and 4 black balls. For j = 1, 2, …, 5 let X_j be equal to 1 or 0, depending on whether the ball drawn on the jth draw is white or black. Find E[X_2], σ^2[X_2], Cov[X_1, X_2], Cov[X_2, X_3].

 

Answer

Mean, 2/3; variance, 2/9; covariances, -2/99.
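These quantities can be confirmed by exact enumeration. The sketch below labels the balls 0 to 11 (with 0 to 7 taken to be white, a labeling assumed only for the illustration) and lists every equally likely ordered draw of the first three balls, which is all that X_1, X_2, and X_3 depend on. The helper `expect` is a name introduced here.

```python
from fractions import Fraction
from itertools import permutations

WHITE = 8    # balls 0..7 are white, balls 8..11 are black (assumed labeling)
TOTAL = 12

# All equally likely ordered draws of three distinct balls.
draws = list(permutations(range(TOTAL), 3))
weight = Fraction(1, len(draws))


def expect(h):
    """Exact expectation of h(X1, X2, X3) over the ordered draws."""
    return sum(h(*(1 if ball < WHITE else 0 for ball in d)) * weight for d in draws)


mean_x1 = expect(lambda x1, x2, x3: x1)
mean_x2 = expect(lambda x1, x2, x3: x2)
mean_x3 = expect(lambda x1, x2, x3: x3)
var_x2 = expect(lambda x1, x2, x3: x2 * x2) - mean_x2**2
cov_12 = expect(lambda x1, x2, x3: x1 * x2) - mean_x1 * mean_x2
cov_23 = expect(lambda x1, x2, x3: x2 * x3) - mean_x2 * mean_x3

print(mean_x2, var_x2, cov_12, cov_23)  # 2/3 2/9 -2/99 -2/99
```

The negative covariance reflects sampling without replacement: drawing a white ball makes a white ball slightly less likely on any other draw.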

 

Exercise 5.

2.2. An urn contains 12 balls, of which 8 are white and 4 are black. A ball is drawn and its color noted. The ball drawn is then replaced; at the same time 2 balls of the same color as the ball drawn are added to the urn. The process is repeated until 5 balls have been drawn. For j = 1, 2, …, 5 let X_j be equal to 1 or 0, depending on whether the ball drawn on the jth draw is white or black. Find E[X_2], σ^2[X_2], Cov[X_1, X_2].

Exercise 6.

2.3. Let X_1 and X_2 be the coordinates of 2 points randomly chosen on the unit interval. Let Y = |X_1 - X_2| be the distance between the points. Find the mean, variance, and third and fourth moments of Y.

 

Answer

f_Y(y) = 2(1 - y) for 0 < y < 1; E[Y] = 1/3, Var[Y] = 1/18, E[Y^3] = 1/10, E[Y^4] = 1/15.

 

Exercise 7.

2.4. Let X_1 and X_2 be independent, identically distributed normal random variables with mean m and variance σ^2. Find the mean of the random variable Y = max(X_1, X_2).

Hint: for any real numbers x_1 and x_2, show and use the fact that 2 max(x_1, x_2) = |x_1 - x_2| + x_1 + x_2.

Exercise 8.

2.5. Let X_1 and X_2 be jointly normally distributed with mean 0, variance 1, and covariance ρ. Find E[max(X_1, X_2)].

 

Answer

((1 - ρ)/π)^{1/2}.

 

Exercise 9.

2.6 . Let X 1 and X 2 have a joint moment-generating function

\psi_{X_{1}, X_{2}}\left(t_{1}, t_{2}\right)=a\left(e^{t_{1}+t_{2}}+1\right)+b\left(e^{t_{1}}+e^{t_{2}}\right),

 

in which a and b are positive constants such that 2 a + 2 b = 1 . Find E [ X 1 ] , E [ X 2 ] , Var [ X 1 ] , Var [ X 2 ] , Cov [ X 1 , X 2 ] .

Exercise 10.

2.7 . Let X 1 and X 2 have a joint moment-generating function

\psi_{X_{1}, X_{2}}\left(t_{1}, t_{2}\right)=\left[a\left(e^{t_{1}+t_{2}}+1\right)+b\left(e^{t_{1}}+e^{t_{2}}\right)\right]^{2},

 

in which a and b are positive constants such that 2 a + 2 b = 1 . Find E [ X 1 ] , E [ X 2 ] , Var [ X 1 ] , Var [ X 2 ] , Cov [ X 1 , X 2 ] .

 

Answer

Means, 1; variances, 0.5; covariance, 2a - 0.5.

 

Exercise 11.

2.8. Let X_1 and X_2 be jointly distributed random variables whose joint moment-generating function has a logarithm given by (2.28), with ν = 4, Y uniformly distributed over the interval -1 to 1, and

\begin{align} & W_{1}(u) = \begin{cases} e^{-(u - a_{1})}, & \text{if } u \geq a_{1} \\ 0, & \text{if } u < a_{1}, \end{cases} \\[3mm] & W_{2}(u) = \begin{cases} e^{-(u - a_{2})}, & \text{if } u \geq a_{2} \\ 0, & \text{if } u < a_{2}, \end{cases} \end{align}

in which a 1 , a 2 are given constants such that 0 < a 1 < a 2 . Find E [ X 1 ] , E [ X 2 ] , Var [ X 1 ] , Var [ X 2 ] , Cov [ X 1 , X 2 ] .

Exercise 12.

2.9. Do exercise 2.8 under the assumption that Y is N(1, 2).

 

Answer

Means, 4; variances, 6; covariance, 6e^{-(a_2 - a_1)}.