The Law of Large Numbers and the Central Limit Theorem

In the applications of probability theory to real phenomena, two results of the mathematical theory of probability play a conspicuous role. These results are known as the law of large numbers and the central limit theorem. At this point in this book we have sufficient mathematical tools available to show how to apply these basic results. In Chapters 9 and 10 we develop the additional mathematical tools required to prove these theorems with a sufficient degree of generality.

A set of $n$ observations $X_{1}, X_{2}, \dots, X_{n}$ is said to constitute a random sample of a random variable $X$ if $X_{1}, X_{2}, \dots, X_{n}$ are independent random variables, identically distributed as $X$. Let

$S_{n} = X_{1} + X_{2} + \dots + X_{n} \qquad (5.1)$

be the sum of the observations. Their arithmetic mean

$M_{n} = \frac{1}{n} S_{n} \qquad (5.2)$

is called the sample mean.

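By (4.1), (4.6), and (4.7), we obtain the following expressions for the mean, variance, and moment-generating function of $S_{n}$ and $M_{n}$, in terms of the mean, variance, and moment-generating function $\psi_{X}$ of $X$ (assuming these exist):

$E [S_{n}] = n E [X], \quad \operatorname{Var} [S_{n}] = n \operatorname{Var} [X], \quad \psi_{S_{n}} (t) = [\psi_{X} (t)]^{n} \qquad (5.3)$

$E [M_{n}] = E [X], \quad \operatorname{Var} [M_{n}] = \frac{1}{n} \operatorname{Var} [X], \quad \psi_{M_{n}} (t) = \left[ \psi_{X} \left( \frac{t}{n} \right) \right]^{n} \qquad (5.4)$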

From (5.4) we obtain the striking fact that the variance of the sample mean $(1 / n) S_{n}$ tends to 0 as the sample size $n$ tends to infinity. Now, by Chebyshev's inequality, it follows that if a random variable has a small variance then it is approximately equal to its mean, in the sense that with probability close to 1 an observation of the random variable will yield an observed value approximately equal to the mean of the random variable; in particular, the probability is at least 0.99 that an observed value of the random variable is within 10 standard deviations of the mean of the random variable. We have thus established that the sample mean of a random sample $X_{1}, X_{2}, \dots, X_{n}$ of a random variable, with a probability that can be made as close to 1 as desired by taking a large enough sample, is approximately equal to the ensemble mean $E [X]$. This fact, known as the law of large numbers, was first established by Bernoulli in 1713 (see section 5 of Chapter 5). The validity of the law of large numbers is the mathematical expression of the fact that increasingly accurate measurements of a quantity (such as the length of a rod) are obtained by averaging an increasingly large number of observations of the value of the quantity. A precise mathematical statement and proof of the law of large numbers is given in Chapter 10.

However, even more can be proved about the sample mean than that it tends to be equal to the mean. One can approximately evaluate, for any interval about the mean, the probability that the sample mean will have an observed value in that interval, since the sample mean is approximately normally distributed. More generally, it may be shown that if $S_{n}$ is the sum of independent identically distributed random variables $X_{1}, X_{2}, \dots, X_{n}$ with finite mean and variance, then, for any real numbers $a < b$,

$\lim_{n \to \infty} P \left[ a \le \frac{S_{n} - E [S_{n}]}{\sigma [S_{n}]} \le b \right] = \frac{1}{\sqrt{2 \pi}} \int_{a}^{b} e^{- y^{2} / 2} \, dy \qquad (5.5)$

In words, (5.5) may be expressed as follows: the sum of a large number of independent identically distributed random variables with finite means and variances, normalized to have mean zero and variance 1, is approximately normally distributed. Equation (5.5) represents a rough statement of one of the most important theorems of probability theory. In 1920 G. Polya gave this theorem the name “the central limit theorem of probability theory”. This name continues to be used today, although a more apt description would be “the normal convergence theorem”. The central limit theorem was first proved by De Moivre in 1733 for the case in which $X_{1}, X_{2}, \dots, X_{n}$ are Bernoulli random variables, so that $S_{n}$ is then a binomial random variable. A proof of (5.5) in this case (with a continuity correction) was given in section 2 of Chapter 6. The determination of the exact conditions for the validity of (5.5) constituted the outstanding problem of probability theory from its beginning until the decade of the 1930s. A precise mathematical statement and proof of the central limit theorem is given in Chapter 10.
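The statement that a normalized sum is approximately normally distributed can be illustrated by simulation. The sketch below assumes, purely for illustration, that the summands are uniformly distributed on the interval 0 to 1 (mean $1/2$, variance $1/12$); the helper names `phi` and `standardized_sum_frequency` are hypothetical. It estimates the probability that the normalized sum falls between $a$ and $b$ and compares it with the corresponding normal probability.

```python
import math
import random

def phi(x):
    """Standard normal distribution function, computed from the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def standardized_sum_frequency(n, a, b, trials=10000, seed=1):
    """Observed frequency with which the normalized sum of n uniform(0, 1) variables lies in [a, b]."""
    rng = random.Random(seed)
    mean = n * 0.5             # E[S_n] = n E[X]
    sd = math.sqrt(n / 12.0)   # sigma[S_n] = sqrt(n Var[X])
    hits = 0
    for _ in range(trials):
        s = sum(rng.random() for _ in range(n))
        if a <= (s - mean) / sd <= b:
            hits += 1
    return hits / trials

# Simulated frequency next to the limiting normal probability.
print(standardized_sum_frequency(30, -1.0, 1.0), phi(1.0) - phi(-1.0))
```

With 30 summands the two printed values should already agree to within simulation error; a less symmetric summand distribution would be expected to need a larger $n$ for comparable agreement.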

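It may be of interest to outline the basic idea of the proof of (5.5), even though the mathematical tools are not at hand to justify the statements made. To prove (5.5) it suffices to prove that the moment-generating function

$\psi_{n} (t) = E \left[ \exp \left\{ t \, \frac{S_{n} - E [S_{n}]}{\sigma [S_{n}]} \right\} \right] \qquad (5.6)$

satisfies for $t$ in a neighborhood of 0

$\lim_{n \to \infty} \log \psi_{n} (t) = \frac{t^{2}}{2} \qquad (5.7)$

in which $t^{2} / 2$ is the logarithm of the moment-generating function of a random variable that is $N (0, 1)$. Now, expanding in Taylor series,

$\psi_{X - E [X]} (u) = E \left[ e^{u (X - E [X])} \right] = 1 + \frac{1}{2} \sigma^{2} [X] u^{2} + A (u) \qquad (5.8)$

where the remainder $A (u)$ satisfies the condition $\lim_{u \to 0} A (u) / u^{2} = 0$. Similarly, $\log (1 + v) = v + B (v)$, where $\lim_{v \to 0} B (v) / v = 0$. Consequently one may show that for values of $u$ sufficiently close to 0

$\log \psi_{X - E [X]} (u) = \frac{1}{2} \sigma^{2} [X] u^{2} + C (u) \qquad (5.9)$

where

$\lim_{u \to 0} \frac{C (u)}{u^{2}} = 0 \qquad (5.10)$

Since the summands are independent and $\sigma [S_{n}] = \sigma [X] \sqrt{n}$, it then follows that

$\log \psi_{n} (t) = n \log \psi_{X - E [X]} \left( \frac{t}{\sigma [X] \sqrt{n}} \right) = \frac{t^{2}}{2} + e_{n} (t) \qquad (5.11)$

where

$e_{n} (t) = n \, C \left( \frac{t}{\sigma [X] \sqrt{n}} \right), \quad \lim_{n \to \infty} e_{n} (t) = 0 \qquad (5.12)$

From (5.11) and (5.12) one obtains (5.7). Our heuristic outline of the proof of (5.5) is now complete.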
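The limit at the heart of this outline, namely that the logarithm of the moment-generating function of the normalized sum tends to $t^{2} / 2$, can be checked in a case where everything is available in closed form. The sketch below assumes, for illustration only, that $X$ takes the values $+1$ and $-1$ with probability $\frac{1}{2}$ each, so that $E [X] = 0$, $\sigma [X] = 1$, and the moment-generating function of $X$ is $\cosh u$; the function name `log_mgf_standardized_sum` is hypothetical.

```python
import math

def log_mgf_standardized_sum(t, n):
    """log of the moment-generating function of S_n / sqrt(n) when X = +1 or -1 with probability 1/2.

    By independence this equals n * log(cosh(t / sqrt(n))).
    """
    return n * math.log(math.cosh(t / math.sqrt(n)))

t = 1.0
for n in (10, 100, 1000, 10**6):
    # The values approach t**2 / 2 = 0.5 from below as n grows.
    print(n, log_mgf_standardized_sum(t, n))
```

In this special case the gap between $n \log \cosh (t / \sqrt{n})$ and $t^{2} / 2$ is of order $t^{4} / (12 n)$, which is one concrete instance of a correction term that vanishes as $n$ tends to infinity.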

Given any random variable $X$ with finite mean and variance, we define its standardization, denoted by $X^{*}$, as the random variable

$X^{*} = \frac{X - E [X]}{\sigma [X]} \qquad (5.13)$

The standardization $X^{*}$ is a dimensionless random variable, with mean $E [X^{*}] = 0$ and variance $\sigma^{2} [X^{*}] = 1$.

The central limit theorem of probability theory can now be formulated: The standardization ${(S_{n})}^{*}$ of the sum $S_{n}$ of a large number $n$ of independent and identically distributed random variables is approximately normally distributed. In Chapter 10 it is shown that this result may be considerably extended to include cases in which $S_{n}$ is the sum of dependent nonidentically distributed random variables.

Example 5A. Reliability. Evaluation of the reliability of rockets is a problem of obvious importance in the space age. By the reliability of a rocket one means the probability $p$ that an attempted launching of the rocket will be successful. Suppose that rockets of a certain type have, by many tests, been established as $90\%$ reliable. Suppose that a modification of the rocket design is being considered. Which of the following sets of evidence throws more doubt on the hypothesis that the modified rocket is only $90\%$ reliable: (i) of 100 modified rockets tested, 96 performed satisfactorily; (ii) of 64 modified rockets tested, 62 (equal to $96.9\%$) performed satisfactorily?

Solution

Let $S_{1}$ be the number of rockets in the group of 100 which performed satisfactorily, and let $S_{2}$ be the number of rockets in the group of 64 which performed satisfactorily. If $p$ is the reliability of a rocket and $q = 1 - p$, then $S_{1}$ and $S_{2}$ have standardizations (since $S_{1}$ and $S_{2}$ have binomial distributions):

${(S_{1})}^{*} = \frac{S_{1} - 100 p}{10 \sqrt{p q}}, \quad {(S_{2})}^{*} = \frac{S_{2} - 64 p}{8 \sqrt{p q}} .$

If $p = 0.9$, $S_{1} = 96$, and $S_{2} = 62$, then ${(S_{1})}^{*} = 2$ and ${(S_{2})}^{*} = \frac{11}{6} \approx 1.83$. If ${(S_{1})}^{*}$ is $N (0, 1)$, the probability of observing a value of ${(S_{1})}^{*}$ greater than or equal to 2 is 0.023. If ${(S_{2})}^{*}$ is $N (0, 1)$, the probability of observing a value of ${(S_{2})}^{*}$ greater than or equal to 1.83 is 0.034. Consequently, scoring 96 successes in 100 tries is better evidence than scoring 62 successes in 64 tries for the hypothesis that the modified rocket has a higher reliability than the original rocket.
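The arithmetic of this example can be reproduced in a few lines. The sketch below uses only the normal approximation described above, with no continuity correction; the helper names `standardize_binomial` and `upper_tail` are hypothetical. Note that it evaluates the second tail probability at the unrounded value $11/6$ rather than at 1.83, so the result may differ slightly in the third decimal place from the figure quoted in the text.

```python
import math

def standardize_binomial(successes, n, p):
    """Standardization (S - n p) / sqrt(n p q) of a binomial count S, with q = 1 - p."""
    return (successes - n * p) / math.sqrt(n * p * (1.0 - p))

def upper_tail(z):
    """P[Z >= z] for a standard normal random variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z1 = standardize_binomial(96, 100, 0.9)  # evidence (i): 96 successes in 100 trials
z2 = standardize_binomial(62, 64, 0.9)   # evidence (ii): 62 successes in 64 trials
print("(i) ", z1, upper_tail(z1))
print("(ii)", z2, upper_tail(z2))
```

The smaller tail probability belongs to evidence (i), which is the basis for the conclusion drawn in the solution.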

Example 5B. Brownian motion and random walk. A particle (of diameter $10^{- 4}$ centimeter, say) immersed in a liquid or gas exhibits ceaseless irregular motions that are discernible under the microscope. The motion of such a particle is called Brownian, after the Scottish botanist Robert Brown, who noticed the phenomenon in 1827. The same phenomenon is also exhibited in striking fashion by smoke particles suspended in air. The explanation of the phenomenon of Brownian motion was one of the major successes of statistical mechanics and kinetic theory. In 1905 Einstein showed that the Brownian motion could be explained by assuming that the particles are subject to the continual bombardment of the molecules of the surrounding medium. The theoretical results of Einstein were soon confirmed by the exact experimental work of Perrin. To appreciate the importance of these events, the reader should be aware that in the years around 1900 atoms and molecules were far from being accepted as they are today; there were still physicists who did not believe in them. After Einstein's work this was possible no longer (see Max Born, Natural Philosophy of Cause and Chance, Oxford, 1949, p. 63). If we let $S_{t}$ denote the displacement after $t$ minutes of a particle in Brownian motion from its starting point, Einstein showed that $S_{t}$ has probability density function

$f_{S_{t}} (x) = \frac{1}{\sqrt{4 \pi D t}} \, e^{- x^{2} / (4 D t)} \qquad (5.14)$

in which $D$ is a constant, called the diffusion coefficient, which depends on the absolute temperature and friction coefficient of the surrounding medium. In words, $S_{t}$ is normally distributed with mean 0 and variance

$\sigma^{2} [S_{t}] = E [S_{t}^{2}] = 2 D t \qquad (5.15)$

The result given by (5.15) is especially important; it states that the mean square displacement $E [S_{t}^{2}]$ of a particle in Brownian motion is proportional to the time $t$. A model for Brownian motion is provided by a particle undergoing a random walk. Let $X_{1}, X_{2}, \dots, X_{n}$ be independent random variables, identically distributed as a random variable $X$, which has mean $E [X] = 0$ and finite variance $E [X^{2}]$. The sum $S_{n} = X_{1} + X_{2} + \dots + X_{n}$ represents the displacement from its starting position of a point (or particle) performing a random walk on a straight line by taking at the $k$th step a displacement $X_{k}$. After $n$ steps, the total displacement $S_{n}$ has a mean and mean square given by

$E [S_{n}] = 0, \quad E [S_{n}^{2}] = n E [X^{2}] \qquad (5.16)$

Thus the mean-square displacement of a particle undergoing a random walk is proportional to the number of steps $n$. Since $S_{n}$ is approximately normally distributed in the sense that (5.5) holds, it might be thought that the probability density function of $S_{n}$ is approximately given by

$f_{S_{n}} (x) = \frac{1}{\sqrt{2 \pi B n}} \, e^{- x^{2} / (2 B n)} \qquad (5.17)$

in which $B = E [X^{2}]$. However, (5.17) represents a stronger conclusion than (5.5). Equation (5.17) is a normal convergence theorem for probability density functions, whereas (5.5) is a normal convergence theorem for distribution functions; (5.17) implies (5.5), but the converse is not true. It may be shown that a sufficient condition for the validity of (5.17) is that the random variable $X$ possesses a square integrable probability density function. From the fact that $S_{n}$ is approximately normally distributed in the sense that (5.5) holds it follows that it is very improbable that a value of $S_{n}$ will be observed more than 3 or 4 standard deviations from its mean. Consequently, in a random walk in which the individual steps have mean 0 it is very unlikely after $n$ steps that the distance from the origin will be greater than $4 \sigma [X] \sqrt{n}$.
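Both conclusions about the random walk, that the mean-square displacement grows in proportion to $n$ and that displacements beyond $4 \sigma [X] \sqrt{n}$ are very rare, can be observed in simulation. The sketch below assumes, for illustration only, steps of $+1$ or $-1$ with probability $\frac{1}{2}$ each, so that $E [X] = 0$ and $E [X^{2}] = 1$; the function name `simulate_walk_stats` is hypothetical.

```python
import math
import random

def simulate_walk_stats(n_steps, walkers=2000, seed=2):
    """Return (mean-square displacement, fraction of walks ending farther than 4*sqrt(n) from the origin)."""
    rng = random.Random(seed)
    limit = 4.0 * math.sqrt(n_steps)  # 4 * sigma[X] * sqrt(n), with sigma[X] = 1 for +/-1 steps
    total_sq = 0
    far = 0
    for _ in range(walkers):
        s = sum(rng.choice((-1, 1)) for _ in range(n_steps))
        total_sq += s * s
        if abs(s) > limit:
            far += 1
    return total_sq / walkers, far / walkers

# The ratio of mean-square displacement to n should stay near E[X^2] = 1.
for n in (25, 100, 400):
    msd, far = simulate_walk_stats(n)
    print(n, msd / n, far)
```

Quadrupling the number of steps should roughly quadruple the mean-square displacement, so the typical distance from the origin only doubles.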

Exercises

5.1. Which of the following sets of evidence throws more doubt on the hypothesis that newborn babies are as likely to be boys as girls: (i) of 10,000 newborn babies, 5100 are male; (ii) of 1000 newborn babies, 510 are male?

Answer

(i) throws more doubt than (ii).

5.2. The game of roulette is described in example 1D. Find the probability that the total amount of money lost by a gambling house on 100,000 bets made by the public on an odd outcome at roulette will be negative.

5.3. As an estimate of the unknown mean $E [X]$ of a random variable, it is customary to take the sample mean $\bar{X} = (X_{1} + X_{2} + \dots + X_{n}) / n$ of a random sample $X_{1}, X_{2}, \dots, X_{n}$ of the random variable $X$. How large a sample should one observe if there is to be a probability of at least 0.95 that the sample mean $\bar{X}$ will not differ from the true mean $E [X]$ by more than $25\%$ of the standard deviation $\sigma [X]$?

Answer

62.
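The stated answer is consistent with treating the sample mean as approximately normally distributed. The sketch below is one way to reproduce it, assuming the normal approximation holds at the sample sizes involved; for contrast it also computes the sample size that Chebyshev's inequality alone would demand. The helper names are hypothetical, and `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist

def normal_sample_size(tolerance_in_sd, confidence):
    """Smallest n with P[|mean - E[X]| <= tolerance * sigma] >= confidence, under the normal approximation."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided normal quantile
    return math.ceil((z / tolerance_in_sd) ** 2)

def chebyshev_sample_size(tolerance_in_sd, confidence):
    """Sample size guaranteed sufficient by Chebyshev's inequality: 1 / (n * tolerance^2) <= 1 - confidence."""
    return math.ceil(1.0 / ((1.0 - confidence) * tolerance_in_sd ** 2))

print(normal_sample_size(0.25, 0.95))     # normal approximation
print(chebyshev_sample_size(0.25, 0.95))  # distribution-free bound, considerably larger
```

The gap between the two figures shows how much is gained by using the central limit theorem in place of a distribution-free bound.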

5.4. A man plays a game in which his probability of winning or losing a dollar is $\frac{1}{2}$. Let $S_{n}$ be the man's fortune (that is, the amount he has won or lost) after $n$ independent plays of the game.

(i) Find $E [S_{n}]$ and $\operatorname{Var} [S_{n}]$. Hint: Write $S_{n} = X_{1} + \dots + X_{n}$, in which $X_{i}$ is the change in the man's fortune on the $i$th play of the game.

(ii) Find approximately the probability that after 10,000 plays of the game the change in the man’s fortune will be between -50 and 50 dollars.

5.5. Consider a game of chance in which one may win 10 dollars or lose 1, 2, 3, or 4 dollars; each possibility has probability 0.20. How many times can this game be played if there is to be a probability of at least $95\%$ that in the final outcome the average gain or loss per game will be between $-2$ and $+2$?

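Answer

25 or more.

5.6. A certain gambler's daily income (in dollars) is a random variable $X$ uniformly distributed over the interval $-3$ to 3.

(i) Find approximately the probability that after 100 days of independent play he will have won more than 200 dollars.

(ii) Find the quantity $A$ such that the probability is greater than $95\%$ that the gambler's winnings (which may be negative) in 100 days of independent play will be greater than $A$.

(iii) Determine the number of days that the gambler can play in order to have a probability greater than $95\%$ that his total winnings on these days will be less than 180 dollars in absolute value.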

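5.7. Add 100 real numbers, each of which is rounded off to the nearest integer. Assume that each rounding error is a random variable uniformly distributed between $- \frac{1}{2}$ and $\frac{1}{2}$ and that the 100 rounding errors are independent. Find approximately the probability that the error in the sum will be between $-3$ and 3. Find the quantity $A$ such that the probability is approximately $99\%$ that the error in the sum will be less than $A$ in absolute value.

Answer

$0.70; 7.4$.

5.8. If the strands of a rope each have a breaking strength with mean 20 pounds and standard deviation 2 pounds, and if the breaking strength of a rope is the sum of the (independent) breaking strengths of all its strands, what is the probability that a rope made up of 64 strands will support a weight of (i) 1280 pounds, (ii) 1240 pounds?

5.9. A delivery truck carries loaded cartons of items. If the weight of each carton is a random variable with mean 50 pounds and standard deviation 5 pounds, how many cartons can the truck carry so that the probability of the total load exceeding 1 ton will be less than $5\%$? State any assumptions made.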

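Answer

38.

5.10. Consider light bulbs, produced by a machine, whose life $X$ in hours is a random variable obeying an exponential probability law with a mean lifetime of 1000 hours.

(i) Find approximately the probability that a sample of 100 bulbs selected at random from the output of the machine will contain between 30 and 40 bulbs with a lifetime greater than 1020 hours.

(ii) Find approximately the probability that the sum of the lifetimes of 100 bulbs selected at random from the output of the machine will be less than 110,000 hours.

5.11. The apparatus known as Galton's quincunx is described in exercise 2.10 of Chapter 6. Assume that in passing from one row to the next the change $X$ in the abscissa of a ball is a random variable with the following probability law: $P [X = \frac{1}{2}] = P [X = - \frac{1}{2}] = \frac{1}{2} - \eta$, $P [X = \frac{3}{2}] = P [X = - \frac{3}{2}] = \eta$, in which $\eta$ is an unknown constant. In an experiment performed with a quincunx consisting of 100 rows, it was found that $80\%$ of the balls inserted into the apparatus passed through the 21 central openings of the last row (that is, the openings with abscissas $0, \pm 1, \pm 2, \dots, \pm 10$). Determine the value of $\eta$ consistent with this result.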

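Answer

$\eta = 0.10$.

5.12. A man invests a total of $N$ dollars in a group of $n$ securities, whose rates of return (interest rates) are independent random variables $X_{1}, X_{2}, \dots, X_{n}$, respectively, with means $i_{1}, i_{2}, \dots, i_{n}$ and variances $\sigma_{1}^{2}, \sigma_{2}^{2}, \dots, \sigma_{n}^{2}$, respectively. If the man invests $N_{j}$ dollars in the $j$th security, then his return in dollars on this particular portfolio is a random variable $R$ given by $R = N_{1} X_{1} + N_{2} X_{2} + \dots + N_{n} X_{n}$. Let the standard deviation $\sigma [R]$ of $R$ be used as a measure of the risk involved in selecting a given portfolio of securities. In particular, let us consider the problem of distributing an investment of 5500 dollars between two securities, one of which has a rate of return $X_{1}$ with mean $6\%$ and standard deviation $1\%$, whereas the other has a rate of return $X_{2}$ with mean $15\%$ and standard deviation $10\%$.

(i) If it is desired to hold the risk to a minimum, what amounts $N_{1}$ and $N_{2}$ should be invested in the respective securities? What are the mean and variance of the return from this portfolio?

(ii) What is the amount of risk that must be taken in order to achieve a portfolio whose mean return is equal to 400 dollars?

(iii) By means of Chebyshev's inequality, find an interval, symmetric about 400 dollars, within which, with probability greater than $75\%$, will lie the return $R$ from the portfolio with a mean return $E [R] = 400$ dollars. Would you be justified in assuming that the return $R$ is approximately normally distributed?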