Dependent Trials
In section 4 of Chapter 2 the notion of conditional probability was discussed for events defined on a sample description space on which a probability function was defined. However, an important use of the notion of conditional probability is to set up a probability function on the subsets of a sample description space $S$, which consists of $n$ trials that are dependent (or, more correctly, nonindependent). In many applications of probability theory involving dependent trials one will state one’s assumptions about the random phenomenon under consideration in terms of certain conditional probabilities that suffice to specify the probability model of the random phenomenon.
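As in section 2, for $k = 1, 2, \ldots, n$, let $\mathcal{A}_k$ be the family of events on $S$ which depend on the $k$th trial. Consider an event $A$ that may be written as the intersection, $A = A_1 A_2 \cdots A_n$, of events $A_1, A_2, \ldots, A_n$, which belong to $\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_n$, respectively. Now suppose that a probability function $P[\cdot]$ has been defined on the subsets of $S$ and suppose that $P[A] > 0$. Then, by the multiplicative rule given in theoretical exercise 1.4,
$$P[A] = P[A_1]\, P[A_2 \mid A_1]\, P[A_3 \mid A_1, A_2] \cdots P[A_n \mid A_1, A_2, \ldots, A_{n-1}]. \tag{4.1}$$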
Now, as shown in section 2, any event that is a combinatorial product event may be written as the intersection of $n$ events, each depending on only one trial. Further, as we pointed out there, a probability function defined on the subsets of a space $S$, consisting of $n$ trials, is completely determined by its values on combinatorial product events.
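Consequently, to know the value of $P[A]$ for any event $A$ it suffices to know, for $k = 2, 3, \ldots, n$, the conditional probability $P[A_k \mid A_1, \ldots, A_{k-1}]$ of any event $A_k$ depending on the $k$th trial, given any events $A_1, A_2, \ldots, A_{k-1}$, depending on the 1st, 2nd, $\ldots$, $(k-1)$st trials, respectively; one also must know $P[A_1]$ for any event $A_1$ depending on the first trial. In other words, if one assumes a knowledge of
$$P[A_1], \quad P[A_2 \mid A_1], \quad P[A_3 \mid A_1, A_2], \quad \ldots, \quad P[A_n \mid A_1, A_2, \ldots, A_{n-1}]$$
for any events $A_1$ in $\mathcal{A}_1$, $A_2$ in $\mathcal{A}_2$, $\ldots$, $A_n$ in $\mathcal{A}_n$, one has thereby specified the value of $P[A]$ for any event $A$ on $S$.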
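Example 4A. Consider an urn containing $M$ balls of which $M_W$ are white. Let a sample of size $n \leq M_W$ be drawn without replacement. Let us find the probability of the event that all the balls drawn will be white. The problem was solved in section 3 of Chapter 2; here, let us see how (4.1) may be used to provide insight into that solution. For $i = 1, \ldots, n$ let $A_i$ be the event that the ball drawn on the $i$th draw is white. We are then seeking $P[A_1 A_2 \cdots A_n]$. It is intuitively appealing that the conditional probability of drawing a white ball on the $i$th draw, given that white balls were drawn on the preceding $(i-1)$ draws, is described for $i = 2, \ldots, n$ by
$$P[A_i \mid A_1, \ldots, A_{i-1}] = \frac{M_W - (i-1)}{M - (i-1)},$$
since just before the $i$th draw the urn contains $M - (i-1)$ balls, of which $M_W - (i-1)$ are white. Together with $P[A_1] = M_W / M$, it follows from (4.1) that
$$P[A_1 A_2 \cdots A_n] = \frac{M_W (M_W - 1) \cdots (M_W - n + 1)}{M (M - 1) \cdots (M - n + 1)}.$$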
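The multiplicative rule used in example 4A can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function name `prob_all_white` and the sample values (10 balls, 4 white) are illustrative assumptions.

```python
from fractions import Fraction


def prob_all_white(M, M_W, n):
    """Probability that n draws without replacement are all white,
    computed as a product of conditional probabilities (rule (4.1))."""
    p = Fraction(1)
    for i in range(1, n + 1):
        # Just before the i-th draw, M - (i-1) balls remain,
        # of which M_W - (i-1) are white (given all earlier draws were white).
        p *= Fraction(M_W - (i - 1), M - (i - 1))
    return p


# Hypothetical urn: 10 balls, 4 white, sample of size 2.
print(prob_all_white(10, 4, 2))  # 2/15
```

Exact fractions are used so that the product can be compared directly with the closed-form ratio of falling factorials.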
Further illustrations of the specification of a probability function on the subsets of a space of dependent trials by means of conditional probability functions of the form given in (4.1) are supplied in examples 4B and 4C.
Example 4B . Consider two urns; urn I contains five white and three black balls, urn II, three white and seven black balls. One of the urns is selected at random, and a ball is drawn from it. Find the probability that the ball drawn will be white.
Solution
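The sample description space $S$ of the experiment described consists of 2-tuples $(z_1, z_2)$, in which $z_1$ is the number of the urn chosen and $z_2$ is the “name” of the ball chosen. The probability function $P[\cdot]$ on the subsets of $S$ is specified by means of the functions listed in (4.1), with $n = 2$, which the assumptions stated in the problem enable us to compute. In particular, let $C_1$ be the event that urn I is chosen, and let $C_2$ be the event that urn II is chosen. Then $P[C_1] = P[C_2] = \frac{1}{2}$. Next, let $B$ be the event that a white ball is chosen. Then $P[B \mid C_1] = \frac{5}{8}$, and $P[B \mid C_2] = \frac{3}{10}$. The events $C_1$ and $C_2$ are the complements of each other. Consequently, by (4.5) of Chapter 2,
$$P[B] = P[B \mid C_1] P[C_1] + P[B \mid C_2] P[C_2] = \frac{5}{8} \cdot \frac{1}{2} + \frac{3}{10} \cdot \frac{1}{2} = \frac{37}{80}. \tag{4.5}$$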
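The computation in example 4B is a weighted average of conditional probabilities. A minimal Python sketch follows; the helper name `total_probability` is an assumption introduced here for illustration.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """Return sum of P[B | C_j] * P[C_j] over mutually exclusive,
    exhaustive events C_j."""
    assert sum(priors) == 1, "the events C_j must be exhaustive"
    return sum(p * l for p, l in zip(priors, likelihoods))


# Urn I: 5 white of 8 balls; urn II: 3 white of 10 balls; urns equally likely.
p_white = total_probability(
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(5, 8), Fraction(3, 10)],
)
print(p_white)  # 37/80
```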
Example 4C. A case of hemophilia. The first child born to a certain woman was a boy who had hemophilia. The woman, who had a long family history devoid of hemophilia, was perturbed about having a second child. She reassured herself by reasoning as follows. “My son obviously did not inherit his hemophilia from me. Consequently, he is a mutant. The probability that my second child will have hemophilia, if he is a boy, is consequently the probability that he will be a mutant, which is a very small number (equal to, say, $m$).” Actually, what is the conditional probability that a second son will have hemophilia, given that the first son had hemophilia?
Solution
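Let us write a 3-tuple $(z_1, z_2, z_3)$ to describe the history of the mother and her two sons with regard to hemophilia. Let $z_1$ equal $s$ or $f$, depending on whether the mother is or is not a hemophilia carrier. Let $z_2$ equal $s$ or $f$, depending on whether the first son is or is not hemophilic. Let $z_3$ equal $s$ or $f$, depending on whether the second son will or will not have hemophilia. On this sample description space, we define the events $A_1$, $A_2$, and $A_3$: $A_1$ is the event that the mother is a hemophilia carrier, $A_2$ is the event that the first son has hemophilia, and $A_3$ is the event that the second son will have hemophilia. To specify a probability function on the subsets of $S$, we specify all conditional probabilities of the form given in (4.1):
$$\begin{aligned} P[A_1] &= 2m, & P[A_1^c] &= 1 - 2m, \\ P[A_2 \mid A_1] &= \tfrac{1}{2}, & P[A_2 \mid A_1^c] &= m, \\ P[A_3 \mid A_1, A_2] &= P[A_3 \mid A_1, A_2^c] = \tfrac{1}{2}, & P[A_3 \mid A_1^c, A_2] &= P[A_3 \mid A_1^c, A_2^c] = m. \end{aligned} \tag{4.6}$$
In making these assumptions (4.6) we have used the fact that the woman has no family history of hemophilia. A boy usually carries an $X$ chromosome and a $Y$ chromosome; he has hemophilia if and only if, instead of an $X$ chromosome, he has an $X'$ chromosome, which bears a gene causing hemophilia. Here $m$ denotes the probability that an $X$ chromosome mutates into an $X'$ chromosome. The mother carries two $X$ chromosomes, so that she is a carrier with probability $1 - (1-m)^2$, which is approximately $2m$; a son of a carrier inherits the $X'$ chromosome with probability $\frac{1}{2}$.
We are seeking $P[A_3 \mid A_2]$. Now
$$P[A_3 \mid A_2] = \frac{P[A_2 A_3]}{P[A_2]}.$$
To compute $P[A_2 A_3]$, we use the formula
$$P[A_2 A_3] = P[A_1] P[A_2 \mid A_1] P[A_3 \mid A_1, A_2] + P[A_1^c] P[A_2 \mid A_1^c] P[A_3 \mid A_1^c, A_2] = 2m \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} + (1 - 2m)\, m \cdot m \approx \tfrac{1}{2} m,$$
since we may consider $1 - 2m$ as approximately equal to 1 and $m^2$ as approximately equal to 0. To compute $P[A_2]$, we use the formula
$$P[A_2] = P[A_2 \mid A_1] P[A_1] + P[A_2 \mid A_1^c] P[A_1^c] = \tfrac{1}{2} \cdot 2m + m (1 - 2m) \approx 2m.$$
Consequently,
$$P[A_3 \mid A_2] \approx \frac{\tfrac{1}{2} m}{2m} = \frac{1}{4}.$$
Thus the conditional probability that the second son of a woman with no family history of hemophilia will have hemophilia, given that her first son has hemophilia, is approximately $\frac{1}{4}$!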
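The approximation in example 4C can be compared with an exact evaluation. The sketch below is illustrative only and rests on assumptions stated here rather than taken from the original: the mother is a carrier with probability $2m$, a son of a carrier is hemophilic with probability $\frac{1}{2}$, a son of a non-carrier is hemophilic (by mutation) with probability $m$, and the value $m = 10^{-5}$ is a hypothetical mutation rate.

```python
def second_son_given_first(m):
    """Exact P[A_3 | A_2] under the assumed model (all values are assumptions):
    P[A_1] = 2m, P[son hemophilic | carrier] = 1/2,
    P[son hemophilic | non-carrier] = m."""
    p_carrier = 2 * m
    p_not_carrier = 1 - 2 * m
    # P[A_2 A_3]: both sons hemophilic, summed over the mother's status.
    p_both = p_carrier * 0.5 * 0.5 + p_not_carrier * m * m
    # P[A_2]: first son hemophilic, summed over the mother's status.
    p_first = p_carrier * 0.5 + p_not_carrier * m
    return p_both / p_first


# Hypothetical mutation probability; the result is close to 1/4.
print(second_son_given_first(1e-5))
```

For any small $m$ the ratio stays near $\frac{1}{4}$, far larger than $m$ itself, which is the point of the example.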
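A very important use of the notion of conditional probability derives from the following extension of (4.5). Let $C_1, C_2, \ldots, C_n$ be $n$ events, each of positive probability, which are mutually exclusive and are also exhaustive (that is, the union of all the events $C_1, C_2, \ldots, C_n$ is equal to the certain event). Then, for any event $B$ one may express the unconditional probability $P[B]$ of $B$ in terms of the conditional probabilities $P[B \mid C_1], \ldots, P[B \mid C_n]$, and the unconditional probabilities $P[C_1], \ldots, P[C_n]$:
$$P[B] = P[B \mid C_1] P[C_1] + P[B \mid C_2] P[C_2] + \cdots + P[B \mid C_n] P[C_n]. \tag{4.11}$$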
Example 4D. On drawing a sample from a sample. Consider a box containing five radio tubes selected at random from the output of a machine, which is known to be 20% defective on the average (that is, the probability that an item produced by the machine will be defective is 0.2).
(i) Find the probability that a tube selected from the box will be defective.
(ii) Suppose that a tube selected at random from the box is defective; what is the probability that a second tube selected at random from the box will be defective?
Solution
To describe the results of the experiment that consists in selecting five tubes from the output of the machine and then selecting one tube from among the five previously selected, we write a 6-tuple $(z_1, z_2, z_3, z_4, z_5, z_6)$; for $k = 1, \ldots, 5$, $z_k$ is equal to $D$ or $N$, depending on whether the $k$th tube selected is defective or non-defective, whereas $z_6$ is equal to $D$ or $N$, depending on whether the tube selected from those previously selected is defective or non-defective. For $j = 0, 1, \ldots, 5$ let $C_j$ denote the event that $j$ defective tubes were selected from the output of the machine.
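Assuming that the selections were independent, $P[C_j] = \binom{5}{j} (0.2)^j (0.8)^{5-j}$. Let $B$ denote the event that the sixth tube, selected from the box, is defective. We assume that $P[B \mid C_j] = j/5$; in words, each of the tubes in the box is equally likely to be chosen. By (4.11), it follows that
$$P[B] = \sum_{j=0}^{5} \frac{j}{5} \binom{5}{j} (0.2)^j (0.8)^{5-j}. \tag{4.13}$$
To evaluate the sum in (4.13), we write it as
$$\sum_{j=1}^{5} \frac{j}{5} \binom{5}{j} (0.2)^j (0.8)^{5-j} = (0.2) \sum_{j=1}^{5} \binom{4}{j-1} (0.2)^{j-1} (0.8)^{4-(j-1)} = 0.2,$$
in which we have used the fact that $\frac{j}{5}\binom{5}{j} = \binom{4}{j-1}$ and the fact that the last sum is equal to 1 by the binomial theorem.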
Let us next consider part (ii) of example 4D. To describe the results of the experiment that consists in selecting five tubes from the output of the machine and then selecting two tubes from among the five previously selected, we write a 7-tuple $(z_1, z_2, \ldots, z_7)$, in which $z_6$ and $z_7$ denote the tubes drawn from the box containing the first five tubes selected. Let $C_0, C_1, \ldots, C_5$ and $B$ be defined as before. Let $A$ be the event that the seventh tube is defective. We seek $P[A \mid B]$. Now, if two tubes, each of which has probability 0.2 of being defective, are drawn independently, the conditional probability that the second tube will be defective, given that the first tube is defective, is equal to the unconditional probability that the second tube will be defective, which is equal to 0.2. We now proceed to prove that $P[A \mid B] = 0.2$. In so doing, we are proving a special case of the principle that a sample of size 2, drawn without replacement from a sample of any size whose members are selected independently from a given population, has statistically the same properties as a sample of size 2 whose members are selected independently from the population! More general statements of this principle are given in the theoretical exercises of section 4, Chapter 4. We prove that $P[A \mid B] = 0.2$ under the assumption that $P[AB \mid C_j] = \frac{j(j-1)}{5 \cdot 4}$ for $j = 0, 1, \ldots, 5$. Then, by (4.11),
$$P[AB] = \sum_{j=0}^{5} \frac{j(j-1)}{5 \cdot 4} \binom{5}{j} (0.2)^j (0.8)^{5-j} = (0.2)^2 \sum_{j=2}^{5} \binom{3}{j-2} (0.2)^{j-2} (0.8)^{3-(j-2)} = (0.2)^2.$$
Consequently, $P[A \mid B] = P[AB] / P[B] = (0.2)^2 / 0.2 = 0.2$.
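Both parts of example 4D can be verified by evaluating the sums exactly. The following Python sketch is an illustration, not the author's computation; the names `p_C`, `p_B`, and `p_AB` are introduced here, and it assumes that, given $j$ defectives in the box, one draw is defective with probability $j/5$ and two draws without replacement are both defective with probability $j(j-1)/(5 \cdot 4)$.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 5)   # probability 0.2 that a tube from the machine is defective
n_box = 5            # number of tubes placed in the box


def p_C(j):
    """Binomial probability that the box holds exactly j defective tubes."""
    return comb(n_box, j) * p**j * (1 - p) ** (n_box - j)


# Part (i): one tube drawn from the box is defective.
p_B = sum(Fraction(j, n_box) * p_C(j) for j in range(n_box + 1))

# Part (ii): two tubes drawn without replacement are both defective.
p_AB = sum(
    Fraction(j * (j - 1), n_box * (n_box - 1)) * p_C(j)
    for j in range(n_box + 1)
)

print(p_B)         # 1/5
print(p_AB / p_B)  # 1/5
```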
Bayes’s Theorem. There is an interesting consequence to (4.11), which has led to much philosophical speculation and has been the source of much controversy. Let $C_1, C_2, \ldots, C_n$ be $n$ mutually exclusive and exhaustive events, and let $B$ be an event for which one knows the conditional probabilities $P[B \mid C_i]$ of $B$, given $C_i$, and also the absolute probabilities $P[C_i]$. One may then compute the conditional probability $P[C_i \mid B]$ of any one of the events $C_i$, given $B$, by the following formula:
$$P[C_i \mid B] = \frac{P[B C_i]}{P[B]} = \frac{P[B \mid C_i] P[C_i]}{\sum_{j=1}^{n} P[B \mid C_j] P[C_j]}. \tag{4.16}$$
The relation expressed by (4.16) is called “Bayes’s theorem” or “Bayes’s formula”, after the English philosopher Thomas Bayes. If the events $C_i$ are called “causes,” then Bayes’s formula can be regarded as a formula for the probability that the event $B$, which has occurred, is the result of the “cause” $C_i$. In this way (4.16) has been interpreted as a formula for the probabilities of “causes” or “hypotheses”. The difficulty with this interpretation, however, is that in many contexts one will rarely know the probabilities, especially the unconditional probabilities $P[C_i]$ of the “causes,” which enter into the right-hand side of (4.16). However, Bayes’s theorem has its uses, as the following examples indicate.
Example 4E. Cancer diagnosis. Suppose, contrary to fact, there were a diagnostic test for cancer with the properties that $P[A \mid C] = 0.95$, $P[A^c \mid C^c] = 0.95$, in which $C$ denotes the event that a person tested has cancer and $A$ denotes the event that the test states that the person tested has cancer. Let us compute $P[C \mid A]$, the probability that a person who according to the test has cancer actually has it. We have
$$P[C \mid A] = \frac{P[AC]}{P[A]} = \frac{P[A \mid C] P[C]}{P[A \mid C] P[C] + P[A \mid C^c] P[C^c]}.$$
Let us assume that the probability that a person taking the test actually has cancer is given by $P[C] = 0.005$. Then
$$P[C \mid A] = \frac{(0.95)(0.005)}{(0.95)(0.005) + (0.05)(0.995)} = \frac{0.00475}{0.00475 + 0.04975} = 0.087.$$
One should carefully consider the meaning of this result. On the one hand, the cancer diagnostic test is highly reliable, since it will detect cancer in 95% of the cases in which cancer is present. On the other hand, in only 8.7% of the cases in which the test gives a positive result and asserts cancer to be present is it actually true that cancer is present! (This example is continued in exercise 4.8.)
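The effect described in example 4E, a reliable test yielding mostly false positives when the condition is rare, can be explored with a short function. This is a sketch only; the function name `posterior` and the input values (a test that is right 95% of the time in both directions, and a prior probability of 0.005) are illustrative assumptions.

```python
def posterior(prior, sensitivity, specificity):
    """Bayes's formula for two complementary causes.

    prior       : P[C], probability the person has the condition
    sensitivity : P[A | C], test positive given condition present
    specificity : P[A^c | C^c], test negative given condition absent
    """
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)


# Assumed illustrative inputs.
print(round(posterior(0.005, 0.95, 0.95), 3))  # 0.087
```

Raising `prior` while holding the test fixed shows how strongly the result depends on the unconditional probability of the condition.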
Example 4F. Prior and posterior probability. Consider an urn that contains a large number of coins, not all of which are necessarily fair. Let a coin be chosen randomly from the urn and tossed independently 100 times. Suppose that in the 100 tosses heads appear 55 times. What is the probability that the coin selected is a fair coin (that is, the probability that the coin will fall heads at each toss is equal to $\frac{1}{2}$)?
Solution
To describe the results of the experiment we write a 101-tuple $(z_1, z_2, \ldots, z_{101})$. The components $z_2, \ldots, z_{101}$ are $H$ or $T$, depending on whether the outcome of the respective toss is heads or tails. What are the possible values that may be assumed by the first component $z_1$? We assume that there is a set of $N$ numbers, $p_1, p_2, \ldots, p_N$, each between 0 and 1, such that any coin in the urn has as its probability of falling heads some one of the numbers $p_1, p_2, \ldots, p_N$. Having selected a coin from the urn, we let $z_1$ denote the probability that the coin will fall heads; consequently, $z_1$ is one of the numbers $p_1, p_2, \ldots, p_N$. Now, for $j = 1, 2, \ldots, N$ let $C_j$ be the event that the coin selected has probability $p_j$ of falling heads, and let $B$ be the event that the coin selected yielded 55 heads in 100 tosses. Let $j_0$ be the number, 1 to $N$, such that $p_{j_0} = \frac{1}{2}$. We are now seeking $P[C_{j_0} \mid B]$, the conditional probability that the coin selected is a fair coin, given that it yielded 55 heads in 100 tosses. In order to use (4.16) to evaluate $P[C_{j_0} \mid B]$, we require a knowledge of $P[C_j]$ and $P[B \mid C_j]$ for $j = 1, 2, \ldots, N$. By the binomial law,
$$P[B \mid C_j] = \binom{100}{55} p_j^{55} (1 - p_j)^{45}.$$
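The probabilities $P[C_j]$ cannot be computed but must be assumed. The probability $P[C_j]$ represents the proportion of coins in the urn which has probability $p_j$ of falling heads. It is clear that the value we obtain for $P[C_{j_0} \mid B]$ depends directly on the values we assume for $P[C_1], \ldots, P[C_N]$. If the latter probabilities are unknown to us, then we must resign ourselves to not being able to compute $P[C_{j_0} \mid B]$. However, let us obtain a numerical answer for $P[C_{j_0} \mid B]$ under the assumption that $P[C_1] = P[C_2] = \cdots = P[C_N] = 1/N$, so that a coin selected from the urn is equally likely to have any one of the probabilities $p_1, \ldots, p_N$. We then obtain that
$$P[C_{j_0} \mid B] = \frac{(1/N) \binom{100}{55} (1/2)^{100}}{(1/N) \sum_{j=1}^{N} \binom{100}{55} p_j^{55} (1 - p_j)^{45}}.$$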
Let us next assume that , and for . Then , and
The unconditional probability $P[C_j]$ that the coin selected has probability $p_j$ of falling heads is called the prior (or a priori) probability of the event $C_j$; the conditional probability $P[C_j \mid B]$, given the observed event $B$ of 55 heads in 100 tosses, is called the posterior (or a posteriori) probability of the event $C_j$. The prior probability is an unconditional probability that is known to us before any observations are taken. The posterior probability is a conditional probability that is of interest to us only if it is known that the conditioning event has occurred.
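The passage from prior to posterior probabilities in example 4F can be sketched in a few lines of Python. The code below is an illustration under assumptions that are not taken from the text: the possible head probabilities are taken to be the nine values $0.1, 0.2, \ldots, 0.9$, and the prior is uniform over them. The function name `posterior_fair` is likewise hypothetical.

```python
from math import comb


def posterior_fair(p_values, priors, heads=55, tosses=100):
    """Posterior probability that the coin is fair (p = 0.5), by Bayes's formula."""
    # Binomial likelihood of the observed number of heads for each candidate p.
    likelihoods = [
        comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)
        for p in p_values
    ]
    # Denominator of Bayes's formula: total probability of the observation.
    evidence = sum(l * q for l, q in zip(likelihoods, priors))
    j0 = p_values.index(0.5)
    return likelihoods[j0] * priors[j0] / evidence


# Assumed grid of head probabilities and assumed uniform prior.
p_values = [j / 10 for j in range(1, 10)]
priors = [1 / len(p_values)] * len(p_values)
result = posterior_fair(p_values, priors)
print(result)
```

Changing `priors` (for example, putting most of the weight on the fair coin) changes the result, which illustrates the dependence of the posterior probability on the assumed prior probabilities.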
Our next example illustrates a controversial use of Bayes’s theorem.
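Example 4G. Laplace’s rule of succession. Consider a coin that in $n$ independent tosses yields $k$ heads. What is the probability that $n'$ subsequent independent tosses will yield $k'$ heads?
Solution
To describe the results of our observations, we write an $(n + n' + 1)$-tuple $(z_1, z_2, \ldots, z_{n+n'+1})$, in which $z_1$ is the probability with which the coin falls heads, the components $z_2, \ldots, z_{n+1}$ describe the outcomes of the tosses already made, and the remaining components describe the outcomes of the subsequent tosses. We assume that there are $N$ known numbers $p_1, p_2, \ldots, p_N$ that $z_1$ can take as its value. For $j = 1, \ldots, N$ let $C_j$ be the event that the coin has probability $p_j$ of falling heads. Let $B$ be the event that the coin yields $k$ heads in its first $n$ tosses, and let $A$ be the event that it yields $k'$ heads in its subsequent $n'$ tosses. We are seeking $P[A \mid B] = P[AB] / P[B]$. By (4.11) and the binomial law,
$$P[AB] = \sum_{j=1}^{N} \binom{n}{k} p_j^{k} (1 - p_j)^{n-k} \binom{n'}{k'} p_j^{k'} (1 - p_j)^{n'-k'} P[C_j],$$
whereas
$$P[B] = \sum_{j=1}^{N} \binom{n}{k} p_j^{k} (1 - p_j)^{n-k} P[C_j].$$
Let us now assume that $p_j$ is equal to $j/N$ and that $P[C_j] = 1/N$. Then
$$P[A \mid B] = \binom{n'}{k'} \frac{\frac{1}{N} \sum_{j=1}^{N} (j/N)^{k+k'} (1 - j/N)^{n+n'-k-k'}}{\frac{1}{N} \sum_{j=1}^{N} (j/N)^{k} (1 - j/N)^{n-k}}. \tag{4.24}$$
The sums in (4.24) may be approximately evaluated in the case that $N$ is large by means of the integral calculus. The sums can be regarded as approximating sums of Riemann integrals, and we have
$$\frac{1}{N} \sum_{j=1}^{N} \left(\frac{j}{N}\right)^{k} \left(1 - \frac{j}{N}\right)^{n-k} \approx \int_0^1 x^{k} (1 - x)^{n-k} \, dx = \frac{k! \, (n-k)!}{(n+1)!}.$$
Consequently, given that the first $n$ tosses yielded $k$ heads, the conditional probability that $n'$ subsequent tosses will yield $k'$ heads is, in the case that $N$ is large, approximately
$$P[A \mid B] = \binom{n'}{k'} \frac{(k + k')! \, (n + n' - k - k')! \, (n+1)!}{(n + n' + 1)! \, k! \, (n-k)!}. \tag{4.26}$$
Equation (4.26) is known as Laplace’s general rule of succession. If we take $k = n$ and $k' = n'$, then
$$P[A \mid B] = \frac{n+1}{n + n' + 1}. \tag{4.27}$$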
Equation (4.27) is known as Laplace’s special rule of succession.
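Equation (4.27) has been interpreted by some writers on probability theory to imply that if a theory has been verified in $n$ consecutive trials then the probability of its being verified on the $(n+1)$st trial is $(n+1)/(n+2)$. The rule has a certain appeal at first acquaintance, as may be seen from the following example: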
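Consider a tourist in a foreign city who scarcely understands the language. With trepidation, he selects a restaurant in which to eat. After ten meals taken there he has felt no ill effects. Consequently, he goes quite confidently to the restaurant the eleventh time in the knowledge that, according to the rule of succession, the probability is $\frac{11}{12}$ that he will not be poisoned by his next meal.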
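However, it is easy to exhibit applications of the rule that lead to absurd answers. A boy is 10 years old today. The rule says that, having lived ten years, he has probability $\frac{11}{12}$ of living one more year. On the other hand, his 80-year-old grandfather has probability $\frac{81}{82}$ of living one more year! Yet, in fact, the boy has the greater probability of living one more year.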
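Laplace gave the following often-quoted application of the special rule of succession. “Assume”, he says, “that history goes back 5000 years, that is, 1,826,213 days. The sun rose each day and so you can bet 1,826,214 to 1 that the sun will rise again tomorrow”. However, before believing this assertion, ask yourself if you would believe the following consequence of the general rule of succession: the sun has risen on each of the last 1,826,213 days, and the probability that it will rise on each of the next 1,826,214 days is $\frac{1}{2}$, which means that the probability is $\frac{1}{2}$ that for at least one day in the next 1,826,214 days the sun will not rise.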
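It is to be emphasized that Bayes’s formula and Laplace’s rule of succession are true theorems of mathematical probability theory. The foregoing examples do not in any way cast doubt on the validity of these theorems. Rather, they serve to illustrate what may be called the fundamental principle of applied probability theory: before applying a theorem, one must carefully ponder whether the hypotheses of the theorem may be assumed to be satisfied.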
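The numerical claims made with the rule of succession can be reproduced exactly. The sketch below is illustrative; it assumes the standard large-$N$ form of Laplace's general rule, $\binom{n'}{k'} \frac{(k+k')!\,(n+n'-k-k')!\,(n+1)!}{(n+n'+1)!\,k!\,(n-k)!}$, and the function names `general_rule` and `special_rule` are introduced here.

```python
from fractions import Fraction
from math import comb, factorial


def general_rule(n, k, n2, k2):
    """Laplace's general rule: probability of k2 heads in n2 further tosses,
    given k heads in the first n tosses (uniform prior, large-N limit)."""
    num = factorial(k + k2) * factorial(n + n2 - k - k2) * factorial(n + 1)
    den = factorial(n + n2 + 1) * factorial(k) * factorial(n - k)
    return comb(n2, k2) * Fraction(num, den)


def special_rule(n, n2=1):
    """Special case k = n, k2 = n2: n successes followed by n2 more successes."""
    return Fraction(n + 1, n + n2 + 1)


print(special_rule(10))                  # 11/12  (tourist; 10-year-old boy)
print(special_rule(80))                  # 81/82  (80-year-old grandfather)
print(special_rule(1826213, 1826214))    # 1/2    (sunrise on each of the next 1,826,214 days)
```

The special rule is recovered from the general one by setting `k = n` and `k2 = n2`, which the code can confirm for small values.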
Theoretical Exercises
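4.1. An urn contains $M$ balls, of which $M_W$ are white (where $M_W \leq M$). Let a sample of size $m$ (where $m \leq M$) be drawn from the urn with replacement [without replacement] and deposited in an empty urn. Let a sample of size $n$ (where $n \leq m$) be drawn from the second urn without replacement. Show that for $k = 0, 1, \ldots, n$ the probability that the second sample will contain exactly $k$ white balls continues to be given by (3.2) [(3.1)] of Chapter 2. The result shows that, as one might expect, drawing a sample of size $n$ from a sample of larger size is statistically equivalent to drawing a sample of size $n$ from the urn. An alternate statement of this theorem, and an outline of the proof, is given in theoretical exercise 4.1 of Chapter 4.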
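4.2. Consider a box containing $N$ radio tubes selected at random from the output of a machine; the probability $p$ that an item produced by the machine is defective is known.
(i) Let $k \leq n \leq N$ be integers. Show that the probability that $n$ tubes selected at random from the box will contain $k$ defectives is given by $\binom{n}{k} p^k (1-p)^{n-k}$.
(ii) Suppose that $m$ tubes are selected at random from the box and found to be defective. Show that the probability that $n$ tubes selected at random from the remaining $N - m$ tubes in the box will contain $k$ defectives is equal to $\binom{n}{k} p^k (1-p)^{n-k}$.
(iii) Suppose that $n$ tubes are selected at random from the box and tested. You are told that at least $k_0$ of the tubes are defective; show that the probability that exactly $k$ tubes are defective, where $k_0$ is an integer from 0 to $n$, is given by (3.13). Express in words the conclusions implied by this exercise.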
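4.3. Consider an urn containing $M$ balls, of which $M_W$ are white. Let $N$ be an integer such that $N \leq M_W$. Choose an integer $n$ at random from the set $\{1, 2, \ldots, N\}$, and then choose a sample of size $n$ without replacement from the urn. Show that the probability that all the balls in the sample will be white (letting $M_R = M - M_W$) is equal to
$$\frac{1}{N} \sum_{n=1}^{N} \frac{(M_W)_n}{(M)_n} = \frac{M_W}{N (M_R + 1)} \left[ 1 - \frac{(M_W - 1)_N}{(M)_N} \right],$$
in which $(M)_n = M (M-1) \cdots (M - n + 1)$.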
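4.4. An application of Bayes’s theorem. Suppose that in answering a question on a multiple choice test an examinee either knows the answer or he guesses. Let $p$ be the probability that he knows the answer, and let $1 - p$ be the probability that he guesses. Assume that the probability of answering a question correctly is unity for an examinee who knows the answer and $1/m$ for an examinee who guesses; $m$ is the number of multiple choice alternatives. Show that the conditional probability that an examinee knew the answer to a question, given that he has correctly answered it, is equal to
$$\frac{mp}{1 + (m-1) p}.$$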
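4.5. Solution of a difference equation. The difference equation
$$p_n = a\, p_{n-1} + b, \qquad n = 2, 3, \ldots,$$
in which $a$ and $b$ are given constants, arises in the theory of Markov dependent trials (see section 5). By mathematical induction, show that if a sequence of numbers $p_1, p_2, \ldots$ satisfies this difference equation, and if $a \neq 1$, then
$$p_n = \left( p_1 - \frac{b}{1-a} \right) a^{n-1} + \frac{b}{1-a}.$$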
Exercises
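4.1. Urn I contains 5 white and 7 black balls. Urn II contains 4 white and 2 black balls. Find the probability of drawing a white ball if (i) 1 urn is selected at random, and a ball is drawn from it, (ii) the 2 urns are emptied into a third urn from which 1 ball is drawn.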
Answer
(i) $\frac{13}{24}$; (ii) $\frac{1}{2}$.
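4.2. Urn I contains 5 white and 7 black balls. Urn II contains 4 white and 2 black balls. An urn is selected at random, and a ball is drawn from it. Given that the ball drawn is white, what is the probability that urn I was chosen?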
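4.3. A man draws a ball from an urn containing 4 white and 2 red balls. If the ball is white, he does not return it to the urn; if the ball is red, he does return it. He draws another ball. Let $A$ be the event that the first ball drawn is white, and let $B$ be the event that the second ball drawn is white. Answer each of the following statements, true or false. (i) , (ii) , (iii) , (iv) , (v) The events $A$ and $B$ are mutually exclusive. (vi) The events $A$ and $B$ are independent.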
Answer
(i) ; (ii) ; (iii) ; (iv) ; (v) ; (vi) .
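4.4. From an urn containing 6 white and 4 black balls, 5 balls are transferred into an empty second urn. From it 3 balls are transferred into an empty box. One ball is drawn from the box; it turns out to be white. What is the probability that exactly 4 of the balls transferred from the first urn to the second were white?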
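4.5. Consider an urn containing 12 balls, of which 8 are white. Let a sample of size 4 be drawn with replacement (without replacement). Next, let a ball be selected at random from the sample of size 4. Find the probability that it will be white.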
Answer
$\frac{2}{3}$.
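4.6. Urn I contains 6 white and 4 black balls. Urn II contains 2 white and 2 black balls. From urn I 2 balls are transferred to urn II. A sample of size 2 is then drawn without replacement from urn II. What is the probability that the sample will contain exactly 1 white ball?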
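4.7. Consider a box containing 5 radio tubes selected at random from the output of a machine, which is known to be 20% defective on the average (that is, the probability that an item produced by the machine will be defective is 0.2). Suppose that 2 tubes are selected at random from the box and tested. You are told that at least 1 of the tubes selected is defective; what is the probability that both tubes will be defective?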
Answer
$\frac{1}{9}$.
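4.8. Let the events $A$ and $C$ be defined as in example 4E. Let $P[A \mid C] = P[A^c \mid C^c] = R$ and $P[C] = 0.005$. What value must $R$ have in order that $P[C \mid A] = 0.95$? Interpret your answer.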
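4.9. In a certain college the geographical distribution of men students is as follows: come from the East, come from the Midwest, and come from the Far West. The following proportions of the men students wear ties: of the Easterners, of the Midwesterners, and of the Far Westerners. What is the probability that a student who wears a tie comes from the East? From the Midwest? From the Far West?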
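Answer
Let the events that a student wears a tie, comes from the East, comes from the Midwest, and comes from the Far West be denoted, respectively, by $A$, $B$, $C$, $D$. Then .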
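4.10. Consider an urn containing 10 balls, of which 4 are white. Choose an integer $n$ at random from the set and then choose a sample of size $n$ without replacement from the urn. Find the probability that all the balls in the sample will be white.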
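4.11. Each of 3 boxes, identical in appearance, has 2 drawers. Box $A$ contains a gold coin in each drawer; box $B$ contains a silver coin in each drawer; box $C$ contains a gold coin in 1 drawer and a silver coin in the other. A box is chosen, one of its drawers is opened, and a gold coin is found.
(i) What is the probability that the other drawer contains a silver coin? Write out the probability space of the experiment. Why is it fallacious to reason that the probability is $\frac{1}{2}$ that there will be a silver coin in the second drawer, since there are 2 possible types of coins, gold or silver, that may be found there?
(ii) What is the probability that the box chosen is box $A$? Box $B$? Box $C$?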
Answer
(i) $\frac{1}{3}$; (ii) box $A$, $\frac{2}{3}$; box $B$, 0; box $C$, $\frac{1}{3}$.
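4.12. Three prisoners, whom we may call $A$, $B$, and $C$, are informed by their jailer that one of them has been chosen at random to be executed, and the other 2 are to be freed. Prisoner $A$, who has studied probability theory, then reasons to himself that he has probability $\frac{1}{3}$ of being executed. He then asks the jailer to tell him privately which of his fellow prisoners will be set free, claiming that there would be no harm in divulging this information, since he already knows that at least 1 will go free. The jailer (being an ethical fellow) refuses to reply to this question, pointing out that if $A$ knew which of his fellows were to be set free then his probability of being executed would increase to $\frac{1}{2}$, since he would then be 1 of 2 prisoners, 1 of whom is to be executed. Show that the probability that $A$ will be executed is still $\frac{1}{3}$, even if the jailer were to answer his question, assuming that, in the event that $A$ is to be executed, the jailer is as likely to say that $B$ is to be set free as he is to say that $C$ is to be set free.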
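4.13. A male rat is either doubly dominant ($AA$) or heterozygous ($Aa$); owing to Mendelian properties, the probability of either being true is $\frac{1}{2}$. The male rat is bred to a doubly recessive ($aa$) female. If the male rat is doubly dominant, the offspring will exhibit the dominant characteristic; if heterozygous, the offspring will exhibit the dominant characteristic $\frac{1}{2}$ of the time and the recessive characteristic $\frac{1}{2}$ of the time. Suppose all of 3 offspring exhibit the dominant characteristic. What is the probability that the male is doubly dominant?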
Answer
$\frac{8}{9}$.
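4.14. Consider an urn that contains 5 white and 7 black balls. A ball is drawn and its color is noted. It is then replaced; in addition, 3 balls of the color drawn are added to the urn. A ball is then drawn from the urn. Find the probability that (i) the second ball drawn will be black, (ii) both balls drawn will be black.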
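4.15. Consider a sample of size 3 drawn in the following manner. One starts with an urn containing 5 white and 7 red balls. At each trial a ball is drawn and its color is noted. The ball drawn is then returned to the urn, together with an additional ball of the same color. Find the probability that the sample will contain exactly (i) 0 white balls, (ii) 1 white ball, (iii) 3 white balls.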
Answer
(i) $\frac{3}{13}$; (ii) $\frac{5}{13}$; (iii) $\frac{5}{52}$.
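4.16. A certain kind of nuclear particle splits into 0, 1, or 2 new particles (which we call offsprings) with probabilities and , respectively, and then dies. The individual particles act independently of each other. Given a particle, let $X_1$ denote the number of its offsprings, let $X_2$ denote the number of offsprings of its offsprings, and let $X_3$ denote the number of offsprings of the offsprings of its offsprings.
(i) Find the probability that .
(ii) Find the conditional probability that , given that .
(iii) Find the probability that .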
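4.17. A number, denoted by $X_1$, is chosen at random from the set of integers $\{1, 2, 3, 4\}$. A second number, denoted by $X_2$, is chosen at random from the set $\{1, 2, \ldots, X_1\}$.
(i) For each integer $k$, 1 to 4, find the conditional probability that $X_2 = 1$, given that $X_1 = k$.
(ii) Find the probability that $X_2 = 1$.
(iii) Find the conditional probability that $X_1 = 2$, given that $X_2 = 1$.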
Answer
(i) $1/k$; (ii) $\frac{25}{48}$; (iii) 0.24.