无序与分区样本——占位问题

我们在前面坚持认为,从瓮中抽取样本的实验应始终以能够明确区分第一个被抽出的球、第二个被抽出的球等方式进行。现在很明显,抽样并不一定需要以这种方式进行。尤其是在不放回抽样的情况下,样本中的球可以不是一次一个地从瓮中取出,而是一次性全部取出。例如,就像在桥牌游戏中,人们可以从一副牌中一次性抽出13张牌,并在全部收到并重新排列后再进行检查。如果从一个装有 M 个球(编号为1到 M )的瓮中一次性抽出 n 个球,那么该实验的结果是数字1到 M 的一个子集 { z 1 , z 2 , , z n } ,而不是一个其分量为数字1到 M n -元组 ( z 1 , z 2 , , z n )

因此,我们被引导去定义有序样本和无序样本的概念。如果关注数字(样本中球上的数字)出现的顺序,则称该样本为有序的。如果只关注样本中出现的数字,而不关注它们出现的顺序,则称该样本为无序的。从一个装有 M 个编号为1到 M 的球的瓮中抽取(放回或不放回)一个大小为 n 的有序样本的随机实验,其样本描述空间由 n -元组 ( z 1 , z 2 , , z n ) 组成,其中每个分量 z j 都是1到 M 之间的一个数字。从一个装有 M 个编号为1到 M 的球的瓮中抽取(放回或不放回)一个大小为 n 的无序样本的随机实验,其样本描述空间 S 由大小为 n 的集合 { z 1 , z 2 , , z n } 组成,其中每个元素 z j 都是1到 M 之间的一个数字。

例5A从一个装有四个球的瓮中抽取大小为3的所有可能无序样本。在例 1A 中,我们列出了从一个装有四个球的瓮中抽取(放回或不放回)一个大小为3的有序样本的随机实验的所有可能样本描述。现在我们列出所有可能的无序样本。如果抽样是不放回的,那么可以抽取的大小为3的可能无序样本是 { 1 , 2 , 3 } , { 1 , 2 , 4 } , { 1 , 3 , 4 } , { 2 , 3 , 4 }  

如果抽样是放回的,那么可以抽取的大小为3的可能无序样本是

{ 1 , 1 , 1 } { 2 , 2 , 2 } { 3 , 3 , 3 } { 4 , 4 , 4 }  
{ 1 , 1 , 2 } { 2 , 2 , 3 } { 3 , 3 , 4 }  
{ 1 , 1 , 3 } { 2 , 2 , 4 } { 3 , 4 , 4 }  
{ 1 , 1 , 4 } { 2 , 3 , 3 }   
{ 1 , 2 , 2 } { 2 , 3 , 4 }   
{ 1 , 2 , 3 } { 2 , 4 , 4 }   
{ 1 , 2 , 4 }    
{ 1 , 3 , 3 }    
{ 1 , 3 , 4 }    
{ 1 , 4 , 4 }    

接下来我们计算 S 的大小。在不放回抽取的无序样本情况下,显然 N [ S ] = ( M n ) ,因为大小为 n 的无序样本的数量与集合 { 1 , 2 , , M } 的大小为 n 的子集的数量相同。在放回抽取的无序样本情况下,可以证明(见理论练习 5.2 N [ S ] = ( M + n 1 n )

在第 3 节中,我们在假设样本是有序的前提下考虑了样本中成功次数的问题。现在假设从一个装有 M 个球(其中 M W 个是白球)的瓮中抽取一个大小为 n 无序样本。对于 k = 0 , 1 , , n ,我们来求事件 A k (即样本中恰好包含 k 个白球)的概率。我们首先考虑不放回抽样的情况。那么 N [ S ] = ( M n ) 。接下来, N [ A k ] = ( M W k ) ( M M W n k ) ,因为 A k 中的任何描述 { z 1 , z 2 , , z n } 包含 k 个白球,这有 ( M W k ) 种选择方式,以及 ( n k ) 个非白球,这有 ( M M W n k ) 种选择方式。因此,在不放回抽取的无序样本情况下, 

很容易验证,在不放回抽样的情况下,由无序样本模型给出的 P [ A k ] 的值与由有序样本模型给出的 P [ A k ] 的值是一致的。然而,在放回抽样的情况下,从一个装有 M 个球(其中 M W 个是白球)的瓮中抽取一个大小为 n 的无序样本,其中恰好包含 k 个白球的概率等于  

这与由有序样本模型给出的 P [ A k ] 的值不一致。

例5B将球分配到瓮中(占位问题)。假设我们有 M 个瓮,编号为1到 M ,我们要将 n 个球分配到这些瓮中,其中 n < M 。编号为1到 n 的每个瓮中恰好包含1个球的概率是多少?

 

解答

A 为事件:编号为1到 n 的每个瓮中恰好包含1个球。为了确定事件 A 所定义的概率空间,我们必须首先对 (i) 球的可区分性 和 (ii) 球的分配方式 做出假设。

 

如果球被视为可区分的(通过标记数字1到 n ),那么为了描述将 n 个球分配到 N 个瓮中的结果,可以写出一个 n -元组 ( z 1 , z 2 , , z n ) ,其第 j 个分量 z j 表示球 j 被放入的瓮的编号。如果球被视为完全相同的,因此是不可区分的,那么为了描述将 n 个球分配到 N 个瓮中的结果,可以写出一个大小为 n 的集合 { z 1 , z 2 , , z n } ,其中每个元素 z j 表示一个球被放入的瓮的编号。因此,在占位问题中,有序样本和无序样本分别对应于分配可区分的球和不可区分的球。 

接下来,在分配球时,可以施加也可以不施加一个排斥规则,其大意是:在分配球时,每个瓮中最多只能放入一个球。很明显,施加排斥规则等价于不放回地选择瓮的编号(抽样),因为每个瓮最多只能被选择一次。如果不施加排斥规则,即可以在任何一个瓮中放入任意数量的球,那么就是放回地选择瓮的编号(抽样)。

图2.4.1: 图像

现在让我们回到计算 P [ A ] 的问题。样本描述空间的大小在表 5A 中针对各种可能的情况给出。接下来,让我们确定 A 的大小。无论是否施加排斥规则,如果球是可区分的,我们得到 N [ A ] = n !;如果球是不可区分的,我们得到 N [ A ] = 1 。因此,如果球是可区分的且无排斥地分配, 如果球是不可区分的且无排斥地分配, 如果球是有排斥地分配,那么无论球被视为可区分的还是不可区分的,结果都没有区别,因为  

前面描述的占位问题的各种不同概率模型,都在统计物理学中找到了应用。假设我们试图确定一个由大量( n 个)性质相同的“粒子”(如电子、质子、光子、介子、中子等)组成的物理系统的平衡态。为简单起见,假设每个粒子可以处于 M 种微观状态(例如,一个粒子可以占据 M 个能级)。为了描述系统的宏观状态,假设只需给出 M 元组 ( n 1 , n 2 , , n M ) ,其第 j 个分量 n j 是处于第 j 种微观状态的“粒子”数。粒子系统的平衡态被定义为出现概率最高的那个宏观状态 ( n 1 , n 2 , , n M ) 。为了计算任何给定宏观状态的概率,必须假设粒子是否遵守泡利不相容原理(该原理指出,在任何微观状态中不能有多于一个粒子)。如果假设不可区分的粒子遵守不相容原理,则称它们具有费米-狄拉克统计。如果不要求不可区分的粒子遵守不相容原理,则称它们具有玻色-爱因斯坦统计。如果假设粒子是可区分的且不遵守不相容原理,则称它们具有麦克斯韦-玻尔兹曼统计。尽管物理粒子不能被认为是可区分的,但在某些情况下,麦克斯韦-玻尔兹曼统计可以作为玻色-爱因斯坦统计和费米-狄拉克统计的近似而正确使用。

定义在一般占位和抽样问题上的各种事件的概率总结在表 6A 中。

分割样本。如果我们考察某些纸牌游戏,可能会注意到另一种抽样类型。我们可以从一个瓮(或一副牌)中抽出 n 个可区分的球(或牌),然后将其分成若干个子集(在桥牌游戏中,分成四手牌)。更准确地说,我们可以指定一个正整数 r 和非负整数 k 1 , k 2 , , k r ,使得 k 1 + k 2 + + k r = n 。然后我们将大小为 n 的样本分成 r 个子集:第一个子集大小为 k 1 ,第二个子集大小为 k 2 , ,第 r 个子集大小为 k r 。例如,在桥牌游戏中,有四手牌(子集),每手牌大小为13,分别称为东、北、西、南(而不是第一、第二、第三和第四子集)。以这种方式抽取的样本的结果是一个子集的r元组 其第一个分量是第一个子集,第二个分量是第二个子集,…,第 r 个分量是第 r 个子集。我们将形式如 (5.6) 的样本称为一个分割样本,其分割方案 ( r : k 1 , k 2 , , k r )

例5C一个分割样本的例子。再次考虑从一个装有四个球(编号为1到4)的瓮中抽取大小为3的样本的实验。如果抽样是不放回的,并且样本被分割,分割方案为 ( 2 ; 1 , 2 ) ,那么可能被抽出的样本是

( { 1 } , { 2 , 3 } ) , ( { 2 } , { 1 , 3 } ) , ( { 3 } , { 1 , 2 } ) , ( { 4 } , { 1 , 2 } )  
( { 1 } , { 2 , 4 } ) , ( { 2 } , { 1 , 4 } ) , ( { 3 } , { 1 , 4 } ) , ( { 4 } , { 1 , 3 } )  
( { 1 } , { 3 , 4 } ) , ( { 2 } , { 3 , 4 } ) , ( { 3 } , { 2 , 4 } ) , ( { 4 } , { 2 , 3 } )  

If the sampling is done with replacement, and the sample is partitioned, with partitioning scheme ( 2 ; 1 , 2 ) , then the possible samples that could have been drawn are

( { 1 } , { 1 , 1 } ) , ( { 2 } , { 1 , 1 } ) , ( { 3 } , { 1 , 1 } ) , ( { 4 } , { 1 , 1 } )  
( { 1 } } 1 , 2 } ) , ( { 2 } , { 1 , 2 } ) , ( { 3 } , { 1 , 2 } ) , ( { 4 } , { 1 , 2 } )  
( { 1 } , { 1 , 3 } ) , ( { 2 } , { 1 , 3 } ) , ( { 3 } , { 1 , 3 } ) , ( { 4 } , { 1 , 3 } )  
( { 1 } , { 1 , 4 } ) , ( { 2 } , { 1 , 4 } ) , ( { 3 } , { 1 , 4 } ) , ( { 4 } , { 1 , 4 } )  
( { 1 } , { 2 , 2 } ) , ( { 2 } , { 2 , 2 } ) , ( { 3 } , { 2 , 2 } ) , ( { 4 } , { 2 , 2 } )  
( { 1 } , { 2 , 3 } ) , ( { 2 } , { 2 , 3 } ) , ( { 3 } , { 2 , 3 } ) , ( { 4 } , { 2 , 3 } )  
( { 1 } , { 2 , 4 } ) , ( { 2 } , { 2 , 4 } ) , ( { 3 } , { 2 , 4 } ) , ( { 4 } , { 2 , 4 } )  
( { 1 } , { 3 , 3 } ) , ( { 2 } , { 3 , 3 } ) , ( { 3 } , { 3 , 3 } ) , ( { 4 } , { 3 , 3 } )  
( { 1 } , { 3 , 4 } ) , ( { 2 } , { 3 , 4 } ) , ( { 3 } , { 3 , 4 } ) , ( { 4 } , { 3 , 4 } )  
( { 1 } , { 4 , 4 } ) , ( { 2 } , { 4 , 4 } ) , ( { 3 } , { 4 , 4 } ) , ( { 4 } , { 4 , 4 } )  

We next derive formulas for the number of ways in which partitioned samples may be drawn.

In the case of sampling without replacement from an urn containing M balls, numbered 1 to M , the number of possible partitioned samples of size n , with partitioning scheme ( r ; k 1 , k 2 , , k r ) , is equal to

Since there are ( M k 1 ) possible subsets of k 1 balls, ( M k 1 k 2 ) possible subsets of k 2 balls (there are M k 1 balls available from which to select the k 2 balls to go into the second subset), it follows that there are ( M k 1 k r 1 k r ) ways in which to select the r th subset.

In the case of sampling with replacement from an urn containing M balls, numbered 1 to M , the number of possible partitioned samples of size n , with partitioning scheme ( r ; k 1 , k 2 , , k r ) , is equal to  

The next example illustrates the theory of partitioned samples and provides a technique whereby card games such as bridge may be analyzed.

Example 5D . An urn contains fifty-two balls, numbered 1 to 52. Let the balls be drawn one at a time and divided among four players in the following manner: for j = 1 , 2 , 3 , 4 , balls drawn on trials numbered j + 4 k (for k = 0 , 1 , , 12 ) are given to player j . Thus player 1 gets the balls drawn on the first, fifth,…, forty-ninth draws, player 2 gets the balls drawn on the second, sixth, …, fiftieth draws, and so on. Suppose that the balls numbered 1, 11, 31, and 41 are considered “lucky”. What is the probability that each player will have a “lucky” ball?

 

Solution

Dividing the fifty-two balls drawn among four players in the manner described is exactly the same process as drawing, without replacement, a partitioned sample of size 52, with partitioning scheme ( 4 ; 13 , 13 , 13 , 13 ) . The sample description space S of the experiment being performed here consists of 4-tuples of mutually exclusive subsets, of size 13, of the numbers 1 to 52, in which (for j = 1 , 2 , , 4 ) the j th subset represents the balls held by the j th player. The size of the sample description space is the number of ways in which a sample of fifty-two balls, partitioned in the way we have described, may be drawn from an urn containing fifty-two distinguishable balls. Thus  

 

We next calculate the size of the event A that each of the four players will have exactly one “lucky” ball. First, consider a description in A that has the following properties: player 1 has ball number 11, player 2 has ball number 41, player 3 has ball number 1, and player 4 has ball number 31. Each description has forty-eight members about which nothing has been specified; consequently there are ( 48 ! ) ( 12 ) 4 descriptions, for in this many ways can the remaining forty-eight balls be distributed among the members of the description. Now the four “lucky” balls can be distributed among the four hands in 4! ways. Consequently,

 

and the probability that each player will possess exactly one “lucky” ball is given by the quotient of (5.10) and (5.9) .

The interested reader may desire to consider for himself the theory of partitions that are unordered, rather than ordered, arrays of subsets.

Theoretical Exercises

5.1 . An urn contains M balls, numbered 1 to M . A sample of size n is drawn without replacement, and the numbers on the balls are arranged in increasing order of their numbers: x 1 < x 2 < < x n . Let K be a number 1 to M , and k , a number 1 to n . Show the probability that x k = K is  

5.2 . The number of unordered samples with replacement . Let U ( M , n ) denote the number of unordered samples of size n that one may draw, by sampling with replacement, from an urn containing M distinguishable balls. Show that U ( M , n ) = ( M + n 1 n ) .

Hint: . To prove the assertion, make use of the principle of mathematical induction. Let P ( n ) be the proposition that, whatever M , U ( M , n ) = ( M + n 1 n ) . P(1) is clearly true, since there are M unordered samples of size 1. To complete the proof, we must show that P ( n ) implies P ( n + 1 ) . The following formula is immediately obtained: for any M = 1 , 2 , , and n = 1 , 2 , : U ( M , n + 1 ) = U ( M , n ) + U ( M 1 , n ) + + U ( 1 , n ) .  

To obtain this formula, let the balls be numbered 1 to M . Let each unordered sample be arrangéd so that the numbers of the balls in the sample are in non-decreasing order (as in the example in the text involving unordered samples of size 3 from an urn containing 4 balls). Then there are U ( M , n ) samples of size ( n + 1 ) whose first entry is 1 , U ( M 1 , n ) samples of size ( n + 1 ) whose first entry is 2, and so on, until there are U ( 1 , n ) whose first entry is M . Now, by the induction hypothesis, U ( k , n ) = ( k + n 1 n ) . Consequently, U ( k , n ) = ( k + n n + 1 ) ( k + n 1 n + 1 ) . We thus determine that U ( M , n + 1 ) = ( M + n n + 1 ) , so that P ( n + 1 ) is proved, and the asserted formula for U ( M , n ) is proved by mathematical induction.

5.3 . Show that the number of ways in which n indistinguishable objects may be arranged in M distinguishable cells is ( M + n 1 n ) = ( M + n 1 M 1 ) .

5.4 . Let n > M . Show that the number of ways in which n indistinguishable objects may be arranged in M distinguishable cells so that no cell will be empty is ( n 1 n M ) = ( n 1 M 1 ) . Hint: It suffices to find the number of ways in which ( n M ) indistinguishable objects may be arranged in M distinguishable cells, since after placing 1 object in each cell the remaining objects may be arranged without restriction.

Exercises

5.1 . On an examination the following question was posed: From a point on the base of a certain mountain there are 5 paths leading to the top of the mountain. In how many ways can one make a round trip (from the base to the top and back again)? Explain why each of the following 4 answers was graded as being correct: (i) ( 5 ) 2 = 20 , (ii) 5 2 = 25 , (iii) ( 5 2 ) = 10 , (iv) ( 6 2 ) = 15 .

5.2 . A certain young woman has 3 men friends. She is told by a fortune teller that she will be married twice and that both her husbands will come from this group of 3 men. How many possible marital histories can this woman have? Consider 4 cases. (May she marry the same man twice? Does the order in which she marries matter?)

5.3 . The legitimate theater in New York gives both afternoon and evening performances on Saturdays. A man comes to New York one Saturday to attend 2 performances (1 in the afternoon and 1 in the evening) of the living theater. There are 6 shows that he might consider attending. In how many ways can he choose 2 shows? Consider 4 cases.

 

Answer

( 6 ) 2 , 6 2 , ( 6 2 ) , ( 7 2 ) .

 

5.4 . An urn contains 52 balls, numbered 1 to 52. Let the balls be drawn 1 at a time and divided among 4 people. Suppose that the balls numbered 1 , 11 , 31 , and 41 are considered “lucky”. What is the probability that (i) each person will have a “lucky” ball, (ii) 1 person will have all 4 “lucky” balls?

5.5 . A bridge player announces that his hand (of 13 cards) contains (i) an ace (that is, at least 1 ace), (ii) the ace of hearts. What is the probability that it will contain another one?

 

Answer

(i) ( 52 13 ) ( 48 13 ) 4 ( 48 12 ) ( 52 13 ) ( 48 13 ) ; (ii) ( 51 12 ) ( 48 12 ) ( 51 12 ) .

 

5.6 . What is the probability that in a division of a deck of cards into 4 bridge hands, 1 of the hands will contain (i) 13 cards of the same suit, (ii) 4 aces and 4 kings, (iii) 3 aces and 3 kings?

5.7 . Prove that the probability of South’s receiving exactly k aces when a bridge deck is divided into 4 hands is the same as the probability that a hand of 13 cards drawn from a bridge deck will contain exactly k aces.

5.8 . An urn contains 8 balls numbered 1 to 8. Four balls are drawn without replacement; suppose x is the second smallest of the 4 numbers drawn. What is the probability that x = 3 ?

5.9 . A red card is removed from a bridge deck of 52 cards; 13 cards are then drawn and found to be the same color. Show that the (conditional) probability that all will be black is equal to 2 3 .

5.10 。一个房间里有10个人,他们佩戴着编号为1到10的徽章。如果随机选择3个人,求以下情况的概率:(i) 所选徽章编号中最大的是5;(ii) 所选徽章编号中最小的是5。

5.11 。从一副52张的扑克牌中抽取偶数张牌。证明这些牌中一半是红色、一半是黑色的概率为 ( 52 ! ( 26 ! ) 2 1 ) ÷ ( 2 51 1 ) . 提示。证明并利用(取 n = 52 )以下事实:对于任意整数 n