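Posing Probability Problems Mathematically
The principle that lies at the foundation of the mathematical theory of probability is the following: to speak of the probability of a random event $A$, one must first set up a probability space on which the event is defined. In this section we show how several problems that arise frequently in applied probability theory may be formulated as mathematically well-posed problems. The examples discussed also illustrate the use of combinatorial analysis to solve probability problems posed in the context of finite sample description spaces with equally likely descriptions.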
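Example 2A. An urn problem. Two balls are drawn with replacement (without replacement) from an urn containing six balls, of which four are white and two are red. Find the probability that (i) both balls will be white, (ii) both balls will be the same color, (iii) at least one of the balls will be white.
Solution
To set up a mathematical model for the experiment described, assume that the balls in the urn are distinguishable; in particular, assume that they are numbered 1 to 6. Let the white balls bear numbers 1 to 4, and let the red balls bear numbers 5 and 6.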
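Let us first consider the case in which the balls are drawn without replacement. The sample description space $S$ of the experiment is then given by (3.1) of Chapter 1; more compactly, we write

$$S = \{(z_1, z_2) : \text{for } i = 1, 2,\ z_i = 1, \ldots, 6, \text{ but } z_1 \neq z_2\}. \tag{2.1}$$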
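In words, (2.1) may be read as follows: $S$ is the set of all 2-tuples $(z_1, z_2)$ whose components are any of the numbers 1 to 6, subject to the restriction that the two components of a 2-tuple may not be equal. The $j$th component $z_j$ of a description represents the number of the ball drawn on the $j$th draw. Now let $A$ be the event that both balls drawn are white, let $B$ be the event that both balls drawn are red, and let $C$ be the event that at least one of the balls drawn is white. The problem at hand can then be stated as that of finding (i) $P[A]$, (ii) $P[A \cup B]$, (iii) $P[C]$. It should be noted that $C = B^c$, so that $P[C] = 1 - P[B]$. Further, $A$ and $B$ are mutually exclusive, so that $P[A \cup B] = P[A] + P[B]$. Now

$$N[A] = 4 \cdot 3 = 12, \qquad N[B] = 2 \cdot 1 = 2, \qquad N[S] = 6 \cdot 5 = 30,$$

so that, if all descriptions are equally likely, $P[A] = 12/30 = 2/5$ and $P[B] = 2/30 = 1/15$.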
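In the case of sampling without replacement the answers to the questions posed in Example 2A are: (i) $P[A] = 2/5$, (ii) $P[A \cup B] = 7/15$, (iii) $P[C] = 14/15$. These probabilities have been obtained under the assumption that the balls in the urn may be regarded as numbered (distinguishable) and that all descriptions in the sample description space $S$ given in (2.1) are equally likely. In the case of sampling with replacement a similar analysis may be carried out; one obtains the answers

$$P[A] = \frac{4 \cdot 4}{6 \cdot 6} = \frac{4}{9}, \qquad P[A \cup B] = \frac{4 \cdot 4 + 2 \cdot 2}{6 \cdot 6} = \frac{5}{9}, \qquad P[C] = 1 - \frac{2 \cdot 2}{6 \cdot 6} = \frac{8}{9}.$$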
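It is interesting to compare the values obtained from the foregoing model with those obtained from two other possible models. One might adopt as sample description space $S = \{(W, W), (W, R), (R, W), (R, R)\}$. This space corresponds to recording the outcome of each draw as $W$ or $R$, according to whether the ball drawn is white or red. If one assumes that all descriptions in $S$ are equally likely, then $P[A] = 1/4$, $P[A \cup B] = 1/2$, and $P[C] = 3/4$. Note that the answers given by this model do not depend on whether the sampling is done with or without replacement. A similar conclusion is reached if one lets $S = \{0, 1, 2\}$, where 0 means that no white ball is drawn, 1 that exactly one white ball is drawn, and 2 that exactly two white balls are drawn.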
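Under the assumption that all descriptions in $S = \{0, 1, 2\}$ are equally likely, one would conclude that $P[A] = 1/3$, $P[A \cup B] = 2/3$, and $P[C] = 2/3$.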
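The numbered-ball model of Example 2A can be checked by listing every description and counting. The following is a minimal sketch, not part of the original text; the function name `urn_probabilities` is illustrative, and the labeling of balls 1 to 4 as white follows the assumption made in the solution.

```python
from fractions import Fraction
from itertools import permutations, product

WHITE = {1, 2, 3, 4}   # balls 1-4 are white, 5-6 are red, as assumed in the text
BALLS = range(1, 7)

def urn_probabilities(with_replacement):
    """Return (P[A], P[A or B], P[C]) by counting equally likely descriptions."""
    if with_replacement:
        space = list(product(BALLS, repeat=2))    # all ordered pairs (z1, z2)
    else:
        space = list(permutations(BALLS, 2))      # ordered pairs with z1 != z2
    n = len(space)
    both_white = sum(1 for z1, z2 in space if z1 in WHITE and z2 in WHITE)
    both_red = sum(1 for z1, z2 in space if z1 not in WHITE and z2 not in WHITE)
    return (Fraction(both_white, n),              # A: both white
            Fraction(both_white + both_red, n),   # A or B: same color
            Fraction(n - both_red, n))            # C: complement of "both red"

print(urn_probabilities(with_replacement=False))
print(urn_probabilities(with_replacement=True))
```

Exact fractions are used so that the counts can be compared directly with the values derived by hand.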
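The next example illustrates the treatment of problems concerning urns of arbitrary composition. It also leads to a conclusion that the reader may find startling when it is put in the following form. Suppose that at a certain time the milk section of a self-service market is known to contain 150 quart bottles of milk, of which 100 are fresh. If one assumes that each bottle is equally likely to be drawn, then the probability is $100/150 = 2/3$ that a bottle drawn from the section will be fresh. However, suppose that one selects a bottle only after fifty other persons have each selected a bottle. Is the probability of drawing a fresh bottle changed from what it would have been had one been the first to draw? By the reasoning employed in Example 2B it can be shown that the probability that the fifty-first bottle drawn will be fresh is the same as the probability that the first bottle drawn will be fresh.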
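Example 2B. An urn of arbitrary composition. An urn contains $M$ balls, of which $M_W$ are white and $M_R$ are red. A sample of size 2 is drawn with replacement (without replacement). What is the probability that (i) the first ball drawn will be white, (ii) the second ball drawn will be white, (iii) both balls drawn will be white?
Solution
Let $A$ denote the event that the first ball drawn is white, $B$ the event that the second ball drawn is white, and $C$ the event that both balls drawn are white. It should be noted that $C = AB$. Let the balls in the urn be numbered 1 to $M$, the white balls bearing numbers 1 to $M_W$ and the red balls bearing numbers $M_W + 1$ to $M$.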
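We first consider the case of sampling with replacement. The sample description space $S$ of the experiment consists of ordered 2-tuples $(z_1, z_2)$, in which $z_1$ is the number of the ball drawn on the first draw and $z_2$ is the number of the ball drawn on the second draw. Clearly, $N[S] = M^2$. To compute $N[A]$, we use the fact that a description belongs to $A$ if and only if its first component is a number 1 to $M_W$ (meaning that a white ball was obtained on the first draw) and its second component is a number 1 to $M$ (because the sampling is with replacement, the color of the ball drawn on the second draw is not affected by the fact that the first ball drawn was white). Thus there are $M_W$ possibilities for the first component of a description in $A$, and for each of these possibilities there are $M$ possibilities for the second component. Consequently, by (1.1), the size of $A$ is $N[A] = M_W M$. Similarly, $N[B] = M M_W$, since there are $M$ possibilities for the first component of a description in $B$ and $M_W$ possibilities for the second component. The reader may verify by a similar argument that the event $C$ (a white ball on both draws) has size $N[C] = M_W^2$. Thus in the case of sampling with replacement one obtains the result, if all descriptions are equally likely, that

$$P[A] = P[B] = \frac{M_W}{M}, \qquad P[C] = \left(\frac{M_W}{M}\right)^2. \tag{2.5}$$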
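We next consider the case of sampling without replacement. The sample description space of the experiment again consists of ordered 2-tuples $(z_1, z_2)$, in which $z_j$ (for $j = 1, 2$) denotes the number of the ball drawn on the $j$th draw. As in the case of sampling with replacement, each $z_j$ is a number 1 to $M$. However, in sampling without replacement a description $(z_1, z_2)$ must satisfy the requirement that its components are not the same. Clearly, $N[S] = M(M - 1)$. Next, $N[A] = M_W(M - 1)$, since there are $M_W$ possibilities for the first component of a description in $A$ and $M - 1$ possibilities for the second component of a description in $A$; the urn from which the second ball is drawn contains only $M - 1$ balls. To compute $N[B]$, we first fix attention on the second component of a description in $B$. Since $B$ is the event that the ball drawn on the second draw is white, there are $M_W$ possibilities for the second component of a description in $B$. For each of these possibilities there are only $M - 1$ possibilities for the first component, since the ball to be drawn on the second draw is known to us and cannot be drawn on the first draw. Thus, by (1.1), $N[B] = (M - 1) M_W$. The reader may verify that the event $C$ has size $N[C] = M_W(M_W - 1)$. Consequently, in sampling without replacement one obtains the result, if all descriptions are equally likely, that

$$P[A] = P[B] = \frac{M_W}{M}, \qquad P[C] = \frac{M_W(M_W - 1)}{M(M - 1)}. \tag{2.6}$$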
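Another way of computing $P[B]$, which the reader may find more convincing on first acquaintance with the theory of probability, is as follows. Let $C$ denote the event that the first ball drawn is white and the second ball drawn is white. Let $D$ denote the event that the first ball drawn is red and the second ball drawn is white. Evidently, $N[C] = M_W(M_W - 1)$ and $N[D] = (M - M_W) M_W$. Since $P[B] = P[C] + P[D]$, we have

$$P[B] = \frac{M_W(M_W - 1)}{M(M - 1)} + \frac{(M - M_W) M_W}{M(M - 1)} = \frac{M_W}{M}.$$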
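To illustrate the use of (2.5) and (2.6), let us consider an urn containing $M = 6$ balls, of which $M_W = 4$ are white. Then, in sampling with replacement, $P[A] = P[B] = 2/3$ and $P[C] = 4/9$, whereas in sampling without replacement $P[A] = P[B] = 2/3$ and $P[C] = 2/5$.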
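The counting argument of Example 2B can be checked by brute force for small urns. The sketch below is illustrative only; the function name `two_draw_probabilities` and its parameters are assumptions introduced here. It enumerates all descriptions $(z_1, z_2)$ and counts those belonging to $A$, $B$, and $C$.

```python
from fractions import Fraction
from itertools import permutations, product

def two_draw_probabilities(M, M_W, with_replacement):
    """Return (P[A], P[B], P[C]) for a sample of size 2 from an urn of M balls.

    Balls 1..M_W are white; all descriptions are taken to be equally likely.
    """
    balls = range(1, M + 1)
    if with_replacement:
        space = list(product(balls, repeat=2))
    else:
        space = list(permutations(balls, 2))   # components must differ
    n = len(space)
    n_A = sum(1 for z1, z2 in space if z1 <= M_W)                 # first white
    n_B = sum(1 for z1, z2 in space if z2 <= M_W)                 # second white
    n_C = sum(1 for z1, z2 in space if z1 <= M_W and z2 <= M_W)   # both white
    return Fraction(n_A, n), Fraction(n_B, n), Fraction(n_C, n)

print(two_draw_probabilities(6, 4, with_replacement=True))
print(two_draw_probabilities(6, 4, with_replacement=False))
```

For $M = 6$ and $M_W = 4$ the enumeration agrees with the values quoted above. In particular, the second draw is white with probability $M_W/M$ whether or not the first ball is replaced.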
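The reader may find (2.6) startling. It is natural that in sampling with replacement $P[A] = P[B]$, that is, that the probability of drawing a white ball is the same on the second draw as on the first, since the composition of the urn is the same on both draws. However, it seems very unnatural, if not unbelievable, that $P[A] = P[B]$ in sampling without replacement. The following remarks may clarify the meaning of (2.6).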
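Suppose that one wishes to regard the event that a white ball is drawn on the second draw as an event defined on a sample description space, denoted by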
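The example we consider next is a generalization of the celebrated problem of repeated birthdays. Suppose that one is in a room containing $n$ persons. What is the probability that no two persons in the room have the same birthday? Let it be assumed that each person in the room can have as a birthday any one of the 365 days of the year (ignoring the existence of leap years) and that each day of the year is equally likely to be the person's birthday. Then selecting a birthday for each person is the same as drawing a number at random from an urn containing $M = 365$ balls, numbered 1 to 365. It is shown in Example 2C that the probability $P_n$ that no two persons in a room containing $n$ persons will have the same birthday is given by

$$P_n = \left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right) \cdots \left(1 - \frac{n - 1}{365}\right).$$

Table 2A gives, for various values of $n$, the value of $P_n$ and of $Q_n = 1 - P_n$, the probability that at least two persons in the room have the same birthday.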
| $n$ | $P_n$ | $Q_n$ |
|---|---|---|
| 4 | 0.984 | 0.016 |
| 8 | 0.926 | 0.074 |
| 12 | 0.833 | 0.167 |
| 16 | 0.716 | 0.284 |
| 20 | 0.589 | 0.411 |
| 22 | 0.524 | 0.476 |
| 23 | 0.493 | 0.507 |
| 24 | 0.462 | 0.538 |
| 28 | 0.346 | 0.654 |
| 32 | 0.247 | 0.753 |
| 40 | 0.109 | 0.891 |
| 48 | 0.039 | 0.961 |
| 56 | 0.012 | 0.988 |
| 64 | 0.003 | 0.997 |
From Table 2A one determines a fact that many students find startling and completely contrary to intuition. How many people must there be in a room in order for the probability to be greater than 0.5 that at least two of them will have the same birthday? Students who have been asked this question have given answers as high as 100, 150, 365, and 730. In fact, the answer is 23!
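The entries of Table 2A can be recomputed from the product $(1 - 1/365)(1 - 2/365) \cdots (1 - (n-1)/365)$. The following sketch assumes the same model as the text (365 equally likely birthdays, leap years ignored); the function name `no_repetition_probability` is an illustrative choice.

```python
def no_repetition_probability(n, M=365):
    """Probability that n draws with replacement from M numbers are all different."""
    p = 1.0
    for k in range(n):
        p *= (M - k) / M      # the (k+1)th draw must avoid the k numbers already seen
    return p

# Smallest room size for which a shared birthday is more likely than not.
smallest = next(n for n in range(1, 366)
                if 1 - no_repetition_probability(n) > 0.5)
print(smallest)   # 23

for n in (22, 23, 24):
    p = no_repetition_probability(n)
    print(n, round(p, 3), round(1 - p, 3))
```

The loop over 22, 23, and 24 reproduces the rows of the table on either side of the crossover.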
Example 2C. The probability of a repetition in a sample drawn with replacement. Let a sample of size $n$ be drawn with replacement from an urn containing $M$ balls, numbered 1 to $M$. Let $P$ denote the probability that there are no repetitions in the sample (that is, that all the numbers in the sample occur just once). Let us show that

$$P = \frac{(M)_n}{M^n} = \left(1 - \frac{1}{M}\right)\left(1 - \frac{2}{M}\right) \cdots \left(1 - \frac{n - 1}{M}\right), \tag{2.8}$$

where $(M)_n = M(M - 1) \cdots (M - n + 1)$.
The sample description space $S$ of the experiment of drawing with replacement a sample of size $n$ from an urn containing $M$ balls, numbered 1 to $M$, is

$$S = \{(z_1, z_2, \ldots, z_n) : \text{for } i = 1, \ldots, n,\ z_i = 1, \ldots, M\}.$$

The $j$th component $z_j$ of a description represents the number of the ball drawn on the $j$th draw. The event $A$ that there are no repetitions in the sample is the set of all $n$-tuples in $S$, no two of whose components are equal. The size of $A$ is given by $N[A] = (M)_n$, since for any description in $A$ there are $M$ possibilities for its first component, $M - 1$ possibilities for its second component, and so on. The size of $S$ is $N[S] = M^n$. If we assume that all descriptions in $S$ are equally likely, then (2.8) follows.
Example 2D. Repeated random digits. Another application of (2.8) is to the problem of repeated random digits. Consider the following experiment. Take any telephone directory and open it to any page. Choose 100 telephone numbers from the page. Count the numbers whose last four digits are all different. If it is assumed that each of the last four digits is chosen (independently) from the numbers 0 to 9 with equal probability, then the probability that the last four digits of a randomly chosen telephone number will be different is given by (2.8), with $n = 4$ and $M = 10$. The probability is $(10)_4/10^4 = 0.504$.
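The value in Example 2D can be confirmed by listing all four-digit endings and counting those with no repeated digit. This is a small illustrative check under the example's assumption that every ending is equally likely; the variable names are arbitrary.

```python
from itertools import product

# Every possible sequence of four final digits, each digit 0-9.
endings = list(product(range(10), repeat=4))

# An ending has no repetition exactly when its four digits form a set of size 4.
all_different = sum(1 for z in endings if len(set(z)) == 4)

print(all_different, len(endings), all_different / len(endings))   # 5040 10000 0.504
```

Out of 100 numbers chosen as described, one would therefore expect about 50 to have four different final digits.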
The next example is concerned with a celebrated problem, which we call here the problem of matches. Suppose you are one of $M$ persons, each of whom has put his hat in a box. Each person then chooses a hat randomly from the box. What is the probability that you will choose your own hat? It seems reasonable that the probability of choosing one’s own hat should be $1/M$, since one could have chosen any one of $M$ hats. However, one might prefer to adopt a more detailed model that takes account of the fact that other persons may already have selected hats. A suitable mathematical model is given in example 2E. In section 6 the model given in example 2E is used to find the probability that at least one person will choose his own hat. But whether the number of hats involved is 8, 80, or far more, the rather startling result obtained is that the probability is approximately equal to $e^{-1} \approx 0.368$ that no man will choose his own hat and approximately equal to $1 - e^{-1} \approx 0.632$ that at least one man will choose his own hat.
Example 2E. Matches (rencontres). Suppose that we have $M$ urns, numbered 1 to $M$, and $M$ balls, numbered 1 to $M$. Let one ball be inserted in each urn. If a ball is put into the urn bearing the same number as the ball, a match is said to have occurred. In section 6 formulas are given (for each integer $m = 0, 1, \ldots, M$) for the probability that exactly $m$ matches will occur. Here we consider only the problem of obtaining, for $k = 1, 2, \ldots, M$, the probability $P[A_k]$ of the event $A_k$ that a match will occur in the $k$th urn. The probability $P[A_k]$ corresponds, in the case of the $M$ persons selecting their hats randomly from a box, to the probability that the $k$th person will select his own hat.
To write the sample description space $S$ of the experiment of distributing $M$ balls in $M$ urns, let $z_j$ represent the number of the ball inserted in the $j$th urn (for $j = 1, \ldots, M$). Then $S$ is the set of $M$-tuples $(z_1, z_2, \ldots, z_M)$, in which each component is a number 1 to $M$, but no two components are equal. The event $A_k$ is the set of descriptions $(z_1, \ldots, z_M)$ in $S$ such that $z_k = k$; in symbols, $A_k = \{(z_1, \ldots, z_M) : z_k = k\}$. It is clear that $N[A_k] = (M - 1)!$ and $N[S] = M!$. If it is assumed that all descriptions in $S$ are equally likely, then $P[A_k] = (M - 1)!/M! = 1/M$. Thus we have proved that the probability of a person’s choosing his own hat does not depend on whether he is the first, second, or even the last person to choose a hat.
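For a small number of urns the model of example 2E can be enumerated directly. The sketch below is an illustration added here, with the hypothetical function name `match_probabilities`. It lists all $M!$ assignments of balls to urns, counts those with a match in each urn, and also counts the assignments with no match at all so that the approximation by $e^{-1}$ mentioned earlier can be compared numerically.

```python
from fractions import Fraction
from itertools import permutations
from math import exp

def match_probabilities(M):
    """Enumerate all M! descriptions (z_1, ..., z_M) with distinct components.

    Returns the list [P[A_1], ..., P[A_M]] and the probability of no match.
    """
    space = list(permutations(range(1, M + 1)))
    n = len(space)
    # z[k] is the ball in urn k+1 (Python indexes from 0), so a match is z[k] == k + 1.
    per_urn = [Fraction(sum(1 for z in space if z[k] == k + 1), n)
               for k in range(M)]
    no_match = Fraction(
        sum(1 for z in space if all(z[k] != k + 1 for k in range(M))), n)
    return per_urn, no_match

per_urn, no_match = match_probabilities(6)
print(per_urn)                      # each entry equals 1/6
print(float(no_match), exp(-1))     # compare with e^{-1}
```

Even with only six hats the probability of no match already agrees with $e^{-1}$ to about three decimal places.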
Sample description spaces in which the descriptions are subsets and partitions rather than $n$-tuples are systematically discussed in section 5. The following example illustrates the ideas.
Example 2F. How to tell a prediction from a guess. In order to verify the contention of the existence of extrasensory perception, the following experiment is sometimes performed. Eight cards, four red and four black, are shuffled, and then each is looked at successively by the experimenter. In another room the subject of study attempts to guess whether the card looked at by the experimenter is red or black. He is required to say “black” four times and “red” four times. If the subject of the study has no extrasensory perception, what is the probability that the subject will “guess” correctly the colors of exactly six of eight cards? Notice that the problem is unchanged if the subject claimed the gift of “prophecy” and, before the cards were dealt, stated the order in which he expected the cards to appear.
Solution
Let us call the first card looked at by the experimenter card 1; similarly, for $j = 2, \ldots, 8$, let the $j$th card looked at by the experimenter be called card $j$. To describe the subject’s response during the course of the experiment, we write the subset $\{z_1, z_2, z_3, z_4\}$ of the numbers $\{1, 2, \ldots, 8\}$, which consists of the numbers of all the cards the subject said were red. The sample description space $S$ then consists of all subsets of size 4 of the set $\{1, 2, \ldots, 8\}$. Therefore, $N[S] = \binom{8}{4} = 70$. The event $A$ that the subject made exactly six correct guesses may be represented as the set of those subsets $\{z_1, z_2, z_3, z_4\}$, exactly three of whose members are equal to the numbers of cards that were, in fact, red. To compute the size of $A$, we notice that the three numbers in a description in $A$, corresponding to a correct guess, may be chosen in $\binom{4}{3}$ ways, whereas the one number in a description in $A$, corresponding to an incorrect guess, may be chosen in $\binom{4}{1}$ ways. Consequently, $N[A] = \binom{4}{3}\binom{4}{1} = 16$, and

$$P[A] = \frac{\binom{4}{3}\binom{4}{1}}{\binom{8}{4}} = \frac{16}{70} = \frac{8}{35}.$$
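The subset model of example 2F can also be checked by enumeration. In the sketch below, which is illustrative only, the choice of cards 1 to 4 as the red cards is an arbitrary assumption; by symmetry any other set of four red cards gives the same count.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

RED = {1, 2, 3, 4}   # assumed positions of the red cards (any 4 of the 8 would do)

# Each description is the set of four card numbers the subject calls "red".
space = list(combinations(range(1, 9), 4))

# Exactly six correct guesses means exactly three of the four "red" calls are right.
exactly_six = sum(1 for s in space if len(RED & set(s)) == 3)

p = Fraction(exactly_six, len(space))
print(exactly_six, len(space), p)   # 16 70 8/35

# The closed-form count from the text gives the same value.
print(p == Fraction(comb(4, 3) * comb(4, 1), comb(8, 4)))   # True
```

Three correct "red" calls force three correct "black" calls as well, which is why six correct guesses correspond to subsets meeting the red cards in exactly three elements.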
Exercises
In solving the following problems, state carefully any assumptions made. In particular, describe the probability space on which the events, whose probabilities are being found, are defined.
2.1 . Two balls are drawn with replacement (without replacement) from an urn containing 8 balls, of which 5 are white and 3 are black. Find the probability that (i) both balls will be white, (ii) both balls will be the same color, (iii) at least 1 of the balls will be white.
Answer
Without replacement, (i) $5/14$, (ii) $13/28$, (iii) $25/28$;
with replacement, (i) $25/64$, (ii) $17/32$, (iii) $55/64$.
2.2 . An urn contains 3 red balls, 4 white balls, and 5 blue balls. Another urn contains 5 red balls, 6 white balls, and 7 blue balls. One ball is selected from each urn. What is the probability that (i) both will be white, (ii) both will be the same color?
2.3 . An urn contains 6 balls, numbered 1 to 6. Find the probability that 2 balls drawn from the urn with replacement (without replacement), (i) will have a sum equal to 7, (ii) will have a sum equal to $k$, for each integer $k$ from 2 to 12.
Answer
2.4 . Two fair dice are tossed. What is the probability that the sum of the dice will be (i) equal to 7, (ii) equal to $k$, for each integer $k$ from 2 to 12?
2.5 . An urn contains 10 balls, bearing numbers 0 to 9. A sample of size 3 is drawn with replacement (without replacement). By placing the numbers in a row in the order in which they are drawn, an integer 0 to 999 is formed. What is the probability that the number thus formed is divisible by 39? Note: regard 0 as being divisible by 39.
Answer
2.6 . Four probabilists arrange to meet at the Grand Hotel in Paris. It happens that there are 4 hotels with that name in the city. What is the probability that all the probabilists will choose different hotels?
2.7 . What is the probability that among the 32 persons who were President of the United States in the period 1789–1952 at least 2 were born on the same day of the year?
Answer
$1 - (365)_{32}/365^{32} \approx 0.753$.
2.8 . Given a group of 4 people, find the probability that at least 2 among them have (i) the same birthday, (ii) the same birth month.
2.9 . Suppose that among engineers there are 12 fields of specialization and that there is an equal number of engineers in each field. Given a group of 6 engineers, what is the probability that no 2 among them will have the same field of specialization?
Answer
$(12)_6/12^6 \approx 0.223$.
2.10 . Two telephone numbers are chosen randomly from a telephone book. What is the probability that the last digits of each are (i) the same, (ii) different?
2.11 . Two friends, Irwin and Danny, are members of a group of 6 persons who have placed their hats on a table. Each person selects a hat randomly from the hats on the table. What is the probability that (i) Irwin will get his own hat, (ii) both Irwin and Danny will get their own hats, (iii) at least one, either Irwin or Danny, will get his own hat?
Answer
(i) $1/6$; (ii) $1/30$; (iii) $3/10$.
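2.12 . Two equivalent decks of 52 different cards are put into random order (shuffled) and matched against each other by successively turning over one card from each deck simultaneously. What is the probability that (i) the first, (ii) the 52nd cards turned over from each deck will coincide? What is the probability that both the first and the 52nd cards turned over from each deck will coincide?
2.14 . In his paper “Probability Preferences in Gambling,” American Journal of Psychology, Vol. 66 (1953), pp. 349–364, W. Edwards tells of a farmer who came to the psychological laboratory of the University of Washington. The farmer brought a carved whalebone with which he claimed he could locate hidden sources of water. The following experiment was conducted to test the farmer’s claim. He was taken into a room in which there were 10 covered cans. He was told that 5 of the 10 cans contained water and 5 were empty. The farmer’s task was to divide the cans into two equal groups, one group containing all the cans with water, the other containing those without water. What is the probability that the farmer put at least 3 cans correctly into the water group entirely by chance?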