Random Samples, Randomly Chosen Points (Geometrical Probability), and Random Division of an Interval
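The concepts now assembled enable us to explain some of the major meanings assigned to the word "random" in the mathematical theory of probability.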
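One meaning is associated with the notion of a random sample of a random variable. Consider a random variable $X$ on which repeated measurements, denoted by $X_1, X_2, \ldots, X_n$, can be made. For example, $X_1, X_2, \ldots, X_n$ may be the lifetimes of each of $n$ electric light bulbs, or they may be the numbers on balls drawn (with or without replacement) from an urn containing balls numbered 1 to 100, and so on. The set of $n$ measurements $X_1, X_2, \ldots, X_n$ is said to be a sample of size $n$ of the random variable $X$, by which is meant that each measurement $X_k$ (for $k = 1, 2, \ldots, n$) is a random variable whose distribution function $F_{X_k}(x)$ is equal, as a function of $x$, to the distribution function $F_X(x)$ of the random variable $X$. If, further, the random variables $X_1, X_2, \ldots, X_n$ are independent, then we say that $X_1, X_2, \ldots, X_n$ is a random sample (or an independent sample) of size $n$ of the random variable $X$. Thus the adjective "random," when used to describe a sample of a random variable, indicates that the members of the sample are independent identically distributed random variables.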
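Example 7A. Suppose that the life in hours of electronic tubes of a certain type is known to be approximately normally distributed with parameters $m$ and $\sigma$. What is the probability that a random sample of four tubes will contain no tube with a lifetime of less than 180 hours?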
Solution
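Let $X_1, X_2, X_3$, and $X_4$ denote the respective lifetimes of the four tubes in the sample. The assumption that the tubes constitute a random sample of a random variable normally distributed with parameters $m$ and $\sigma$ is to be interpreted as assuming that the random variables $X_1, X_2, X_3$, and $X_4$ are independent, each with probability density function, for $i = 1, \ldots, 4$,
$$f_{X_i}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2\right].$$
The probability that each tube in the sample has a lifetime greater than or equal to 180 hours is given by
$$P[X_1 \ge 180, \ldots, X_4 \ge 180] = P[X_1 \ge 180] \cdots P[X_4 \ge 180] = \left[1 - \Phi\left(\frac{180 - m}{\sigma}\right)\right]^4,$$
since $P[X_i \ge 180] = 1 - \Phi\left(\frac{180 - m}{\sigma}\right)$ for each $i$, in which $\Phi$ denotes the standard normal distribution function.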
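As a numerical illustration of the calculation in Example 7A, the following minimal sketch evaluates $[1 - \Phi((180 - m)/\sigma)]^4$. The parameter values `m_assumed = 160` and `sigma_assumed = 20` are hypothetical and chosen only for illustration, and the function names are not part of the text.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_all_exceed(threshold: float, m: float, sigma: float, n: int) -> float:
    """P[all n independent N(m, sigma^2) lifetimes are >= threshold]."""
    p_one = 1.0 - normal_cdf((threshold - m) / sigma)
    # Independence lets the joint probability factor into a product.
    return p_one ** n


# Hypothetical parameter values (an assumption, not taken from the text).
m_assumed, sigma_assumed = 160.0, 20.0
print(prob_all_exceed(180.0, m_assumed, sigma_assumed, 4))
```

The fourth power appears only because the four lifetimes are assumed independent; for a sample that is not random in this sense the product rule would not apply.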
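A second meaning of the word "random" arises when it is used to describe a sample drawn from a finite population. A sample, each of whose components is drawn from a finite population, is said to be a random sample if at each draw all candidates available for selection have an equal probability of being selected. The word "random" was used in this sense throughout Chapter 2.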
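Example 7B. As in Example 7A, consider electronic tubes of a certain type whose lifetimes are normally distributed with parameters $m$ and $\sigma$. Let a random sample of four tubes be put into a box. Choose a tube randomly from the box. What is the probability that the tube selected will have a lifetime greater than 180 hours?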
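One way to see what Example 7B asks is to simulate it: draw a random sample of four lifetimes (the first meaning of "random"), then pick one of the four with equal probability (the second meaning). Since every tube in the box has the same normal distribution, the relative frequency should be close to the single-tube probability $1 - \Phi((180 - m)/\sigma)$. The sketch below assumes the hypothetical values $m = 160$ and $\sigma = 20$; the function names, trial count, and seed are likewise illustrative.

```python
import math
import random


def normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_pick_from_box(m, sigma, threshold=180.0, box_size=4,
                           trials=100_000, seed=0):
    """Relative frequency with which a randomly chosen tube outlives threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A random sample: independent, identically distributed lifetimes.
        box = [rng.gauss(m, sigma) for _ in range(box_size)]
        # A random choice from a finite population: each tube equally likely.
        if rng.choice(box) > threshold:
            hits += 1
    return hits / trials


# Hypothetical parameter values (an assumption, not taken from the text).
m_assumed, sigma_assumed = 160.0, 20.0
estimate = simulate_pick_from_box(m_assumed, sigma_assumed)
exact = 1.0 - normal_cdf((180.0 - m_assumed) / sigma_assumed)
print(f"simulated {estimate:.4f}, single-tube probability {exact:.4f}")
```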
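The word "random" has a third common meaning. The phrase "a point randomly chosen on the interval $a$ to $b$" is used for brevity to describe a random variable obeying a uniform probability law over the interval $a$ to $b$, whereas the phrase "$n$ points chosen randomly on the interval $a$ to $b$" is used for brevity to describe $n$ independent random variables, each obeying a uniform probability law over the interval $a$ to $b$. Problems involving randomly chosen points have long been discussed by probabilists under the heading of "geometrical probabilities." In modern terminology, problems involving geometrical probabilities may be formulated as problems involving independent random variables, each obeying a uniform probability law.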
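Example 7C. Two points are selected randomly on a line of length $a$ so as to be on opposite sides of the midpoint of the line. Find the probability that the distance between them is less than .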
Solution
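Introduce a coordinate system on the line so that its left-hand endpoint is 0 and its right-hand endpoint is $a$. Let $X_1$ be the coordinate of the point selected randomly in the interval 0 to $a/2$, and let $X_2$ be the coordinate of the point selected randomly in the interval $a/2$ to $a$. We assume that the random variables $X_1$ and $X_2$ are independent and that each obeys a uniform probability law over its interval. The joint probability density function of $X_1$ and $X_2$ is then
$$f_{X_1, X_2}(x_1, x_2) = \begin{cases} 4/a^2 & \text{if } 0 < x_1 < a/2 \text{ and } a/2 < x_2 < a, \\ 0 & \text{otherwise.} \end{cases}$$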
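With $X_1$ uniform on 0 to $a/2$ and $X_2$ uniform on $a/2$ to $a$, independently, the probability that $X_2 - X_1$ is less than a threshold $c$ is the constant density $4/a^2$ times the area of the part of the rectangle on which $x_2 - x_1 < c$. A minimal sketch follows; the threshold `c` is a free parameter introduced here for illustration, and the function names are hypothetical.

```python
import random


def prob_distance_less(c: float, a: float) -> float:
    """Exact P[X2 - X1 < c] for independent X1 ~ U(0, a/2), X2 ~ U(a/2, a)."""
    if c <= 0:
        return 0.0
    if c >= a:
        return 1.0
    if c <= a / 2:
        # The favorable region is a right triangle with legs of length c.
        return 2.0 * c * c / (a * a)
    # Otherwise the unfavorable region is a right triangle with legs a - c.
    return 1.0 - 2.0 * (a - c) ** 2 / (a * a)


def simulate_distance_less(c, a, trials=200_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = rng.uniform(0.0, a / 2)
        x2 = rng.uniform(a / 2, a)
        if x2 - x1 < c:
            hits += 1
    return hits / trials


# Illustrative threshold only: one quarter of a line of unit length.
print(prob_distance_less(0.25, 1.0), simulate_distance_less(0.25, 1.0))
```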

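Example 7D. Consider again the random variables $X_1$ and $X_2$ defined in Example 7C. Find the probability that the three line segments (from 0 to $X_1$, from $X_1$ to $X_2$, and from $X_2$ to $a$) could be made to form the three sides of a triangle.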
Solution
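In order that the three line segments mentioned can form a triangle, it is necessary and sufficient that the following inequalities be fulfilled (why?):
$$X_1 < (X_2 - X_1) + (a - X_2), \qquad X_2 - X_1 < X_1 + (a - X_2), \qquad a - X_2 < X_1 + (X_2 - X_1).$$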
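The triangle inequalities say that each of the three segments, whose lengths sum to $a$, must be shorter than $a/2$. The two outer segments always are, since $X_1 < a/2$ and $a - X_2 < a/2$, so the only condition that can fail is $X_2 - X_1 < a/2$, which has probability $1/2$ under the uniform assumptions of Example 7C. The following sketch checks this by simulation; the function names, trial count, and seed are illustrative assumptions.

```python
import random


def can_form_triangle(p: float, q: float, r: float) -> bool:
    """True when each length is strictly less than the sum of the other two."""
    return p < q + r and q < p + r and r < p + q


def simulate_triangle(a=1.0, trials=200_000, seed=1):
    """Relative frequency with which the three segments form a triangle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = rng.uniform(0.0, a / 2)   # point on the left half of the line
        x2 = rng.uniform(a / 2, a)     # point on the right half of the line
        if can_form_triangle(x1, x2 - x1, a - x2):
            hits += 1
    return hits / trials


print(simulate_triangle())
```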

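Problems involving geometrical probabilities have played an important role in the development of the modern conception of probability. In the nineteenth century the Laplacian definition of probability was widely accepted. It was thought that probability problems could be given unique solutions by finding the proper framework of "equally likely" descriptions. To counter this view, examples were constructed that admit several equally plausible but mutually incompatible solutions. We now discuss an example similar to one given by Joseph Bertrand in his book Calcul des probabilités (Paris, 1889, p. 4), which was later called "Bertrand's paradox" by Poincaré. It has been pointed out by one of the author's students that this example should serve as a warning to all who adopt practical policies on the basis of theoretical solutions: they must first make sure that the assumptions underlying the solution agree with the experimentally observed facts.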
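Example 7E. Bertrand's paradox. Let a chord be chosen randomly in a circle of radius $r$. What is the probability that the length $L$ of the chord will be less than the radius $r$?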
Solution
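The meaning of "a chord chosen randomly" is not clear. To give meaning to this phrase, we reformulate the problem as one involving randomly chosen points. We shall state two methods of determining a chord by means of randomly chosen points. In this way we obtain two different answers for the probability $P[L < r]$ that the length $L$ of the randomly chosen chord will be less than the radius $r$.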
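One method is as follows: let $X_1$ and $X_2$ be points chosen randomly on the interval 0 to  and on the interval 0 to $r$, respectively. Draw a chord by letting $X_1$ be the angle made by the chord with a fixed reference line and by letting $X_2$ be the (perpendicular) distance of the midpoint of the chord from the center of the circle (see Fig. 7C). A second method of randomly choosing a chord is as follows: let $Y_1$ and $Y_2$ be points chosen randomly on the interval 0 to  and on the interval 0 to , respectively. Draw a chord by letting $Y_1$ and $Y_2$ be the angles indicated in Fig. 7D. The reader may be able to think of other methods of determining a chord by choosing points. Six different solutions of Bertrand's paradox are given in Czuber, Wahrscheinlichkeitsrechnung, B. G. Teubner, Leipzig, 1908, pp. 106–109.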


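The length $L$ of the chord may be expressed in terms of the random variables $X_2$ and $Y_2$. Under the first method, $L = 2\sqrt{r^2 - X_2^2}$, since $X_2$ is the perpendicular distance from the center of the circle to the chord; under the second method, $L$ is determined by the angle $Y_2$ indicated in Fig. 7D.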
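It should be noted that random experiments can be performed in such a way that either (7.10) or (7.11) would be the correct probability in the sense of the frequency definition of probability. If a disk of diameter $2r$ is cut out of cardboard and tossed at random on a table ruled with parallel lines a distance $2r$ apart, then one and only one of these lines will cross the disk. All distances from the center will be equally likely, and (7.10) will represent the probability that the chord drawn by the line across the disk has a length less than $r$. On the other hand, if the disk is held by a pivot through a point on its edge, which point lies on one of the lines, and is spun randomly about this point, then (7.11) will represent the probability that the chord drawn by the line across the disk has a length less than $r$.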
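The two methods can be compared numerically. The sketch below rests on stated assumptions, since the figures are not reproduced here: for the first method, the distance $X_2$ of the chord's midpoint from the center is uniform on 0 to $r$, so that $L = 2\sqrt{r^2 - X_2^2}$; for the second, the angle $Y_2$ is taken to be the angle between the chord and the diameter through one of its endpoints, uniform on 0 to $\pi/2$, so that $L = 2r\cos Y_2$. Under these assumptions the probability that $L < r$ is $1 - \sqrt{3}/2 \approx 0.134$ for the first method and $1/3$ for the second. All function names are hypothetical.

```python
import math
import random


def chord_from_distance(r: float, x2: float) -> float:
    """Chord length when the midpoint lies at distance x2 from the center."""
    return 2.0 * math.sqrt(max(0.0, r * r - x2 * x2))


def chord_from_angle(r: float, y2: float) -> float:
    """Chord length when the chord makes angle y2 with the diameter
    through one of its endpoints (an assumed parametrization)."""
    return 2.0 * r * math.cos(y2)


def simulate_bertrand(r=1.0, trials=200_000, seed=0):
    """Estimate P[L < r] under each of the two ways of choosing a chord."""
    rng = random.Random(seed)
    short_by_distance = 0
    short_by_angle = 0
    for _ in range(trials):
        if chord_from_distance(r, rng.uniform(0.0, r)) < r:
            short_by_distance += 1
        if chord_from_angle(r, rng.uniform(0.0, math.pi / 2)) < r:
            short_by_angle += 1
    return short_by_distance / trials, short_by_angle / trials


p_distance, p_angle = simulate_bertrand()
print(p_distance, 1.0 - math.sqrt(3.0) / 2.0)
print(p_angle, 1.0 / 3.0)
```

The two estimates differ because the two procedures induce different probability laws on the chord length, which is the point of the paradox.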
The following example has many important extensions and practical applications.
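Example 7F. The probability of an uncrowded road. Along a straight road, $L$ miles long, there are $n$ distinguishable persons distributed at random. Show that the probability that no two persons will be less than a distance $d$ miles apart is equal to, for $d$ such that $(n-1)d \le L$,
$$\left(1 - (n-1)\frac{d}{L}\right)^n. \tag{7.12}$$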
Solution
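For $j = 1, 2, \ldots, n$ let $X_j$ denote the position of the $j$th person. We assume that $X_1, X_2, \ldots, X_n$ are independent random variables, each uniformly distributed over the interval 0 to $L$. Their joint probability density function is then given by
$$f_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = \begin{cases} 1/L^n & \text{if } 0 \le x_1, x_2, \ldots, x_n \le L, \\ 0 & \text{otherwise.} \end{cases}$$
Next, for each permutation, or ordered $n$-tuple chosen without replacement, $(i_1, i_2, \ldots, i_n)$ of the integers 1 to $n$, define
$$I(i_1, i_2, \ldots, i_n) = \{(x_1, x_2, \ldots, x_n): \; x_{i_1} < x_{i_2} < \cdots < x_{i_n}\}.$$
Thus $I(i_1, i_2, \ldots, i_n)$ is a zone of points in $n$-dimensional Euclidean space. There are $n!$ such zones, and they are mutually exclusive. The union of all zones does not include all the points in $n$-dimensional space, since an $n$-tuple $(x_1, x_2, \ldots, x_n)$ that contains two equal components does not lie in any zone. However, we are able to ignore the set of points not included in any of the zones, since this set has probability zero under a continuous probability law. Now the event $B$ that no two persons are less than a distance $d$ apart may be represented as the set of $n$-tuples $(x_1, x_2, \ldots, x_n)$ for which the distance $|x_i - x_j|$ between any two components is greater than $d$. To find the probability of $B$, we must first find the probability of the intersection of $B$ and each zone $I(i_1, i_2, \ldots, i_n)$. We may represent this intersection as follows:
$$B \cap I(i_1, i_2, \ldots, i_n) = \{(x_1, x_2, \ldots, x_n): \; x_{i_k} + d < x_{i_{k+1}} \text{ for } k = 1, 2, \ldots, n-1\}.$$
Consequently,
$$P[B \cap I(i_1, \ldots, i_n)] = \frac{1}{L^n} \int_0^{L-(n-1)d} dx_{i_1} \int_{x_{i_1}+d}^{L-(n-2)d} dx_{i_2} \cdots \int_{x_{i_{n-1}}+d}^{L} dx_{i_n} = \left(\frac{L'}{L}\right)^n \int_0^1 du_1 \int_{u_1}^1 du_2 \cdots \int_{u_{n-1}}^1 du_n = \frac{1}{n!}\left(1 - (n-1)\frac{d}{L}\right)^n,$$
in which we have made the change of variables $u_k = [x_{i_k} - (k-1)d]/L'$ for $k = 1, 2, \ldots, n$, and have set $L' = L - (n-1)d$.
The probability of $B$ is equal to the product of $n!$ and the probability of the intersection of $B$ and any zone $I(i_1, i_2, \ldots, i_n)$. The proof of (7.12) is now complete.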
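The result of Example 7F, that $n$ independent uniformly distributed positions on a road of length $L$ are pairwise at least $d$ apart with probability $(1 - (n-1)d/L)^n$ when $(n-1)d \le L$, can be checked by simulation. The following is a minimal sketch; the function names, the chosen values of $n$, $d$, and $L$, and the seed are illustrative assumptions.

```python
import random


def prob_uncrowded(n: int, d: float, L: float) -> float:
    """Closed form for P[no two of n uniform points on [0, L] are within d]."""
    if (n - 1) * d > L:
        return 0.0  # the n persons cannot fit with gaps of d
    return (1.0 - (n - 1) * d / L) ** n


def simulate_uncrowded(n, d, L, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Sorting corresponds to restricting attention to a single zone.
        positions = sorted(rng.uniform(0.0, L) for _ in range(n))
        if all(right - left >= d
               for left, right in zip(positions, positions[1:])):
            hits += 1
    return hits / trials


print(prob_uncrowded(3, 0.1, 1.0), simulate_uncrowded(3, 0.1, 1.0))
```

Checking only the gaps between neighbors after sorting is enough, because within a zone the smallest pairwise distance is always attained by two adjacent points.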
In a similar manner one may solve the following problem.
Example 7G. Packing cylinders randomly on a rod. Consider a horizontal rod of length $L$ on which $n$ equal cylinders, each of length $c$, are distributed at random. The probability that no two cylinders will be less than $d$ apart is equal to, for $d$ such that $(n-1)d \le L - nc$,
$$\left(1 - \frac{(n-1)d}{L - nc}\right)^n.$$
The foregoing considerations, together with (6.2) of Chapter 2, establish an extremely useful result.
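The Random Division of an Interval or a Circle. Suppose that a straight line of length $a$ is divided into $n$ sub-intervals by $n-1$ points chosen at random on the line or that a circle of circumference $a$ is divided into $n$ sub-intervals by $n$ points chosen at random on the circle. Then the probability $P_k$ that exactly $k$ of the sub-intervals will exceed $b$ in length is given by
$$P_k = \binom{n}{k} \sum_{j} (-1)^j \binom{n-k}{j} \left(1 - (k+j)\frac{b}{a}\right)^{n-1}, \tag{7.19}$$
in which the sum extends over the integers $j = 0, 1, \ldots, n-k$ for which $(k+j)b \le a$.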
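It may clarify the meaning of (7.19) to express it in terms of random variables. Let $X_1, X_2, \ldots, X_{n-1}$ be the coordinates of the $n-1$ points chosen randomly on the line (a similar discussion may be given for the circle). Then $X_1, X_2, \ldots, X_{n-1}$ are independent random variables, each uniformly distributed on the interval 0 to $a$. Define new random variables $Y_1, Y_2, \ldots, Y_{n-1}$ as follows: $Y_1$ is equal to the minimum of $X_1, X_2, \ldots, X_{n-1}$; $Y_2$ is equal to the second smallest number among $X_1, X_2, \ldots, X_{n-1}$; and so on, up to $Y_{n-1}$, which is equal to the maximum of $X_1, X_2, \ldots, X_{n-1}$. The random variables $Y_1, Y_2, \ldots, Y_{n-1}$ thus constitute a reordering of the random variables $X_1, X_2, \ldots, X_{n-1}$ according to increasing magnitude. For this reason, the random variables $Y_1, Y_2, \ldots, Y_{n-1}$ are called the order statistics corresponding to $X_1, X_2, \ldots, X_{n-1}$. The random variable $Y_k$, for $k = 1, 2, \ldots, n-1$, is usually spoken of as the $k$th smallest value among $X_1, X_2, \ldots, X_{n-1}$.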
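The lengths $W_1, W_2, \ldots, W_n$ of the successive subintervals into which the $n-1$ randomly chosen points divide the line may now be expressed in terms of the order statistics $Y_1, Y_2, \ldots, Y_{n-1}$ of the points:
$$W_1 = Y_1, \qquad W_j = Y_j - Y_{j-1} \text{ for } j = 2, \ldots, n-1, \qquad W_n = a - Y_{n-1}.$$
The probability $P_k$ is the probability that exactly $k$ of the $n$ events $[W_1 > b], [W_2 > b], \ldots, [W_n > b]$ will occur. To prove (7.19), one needs only to verify that for any integer $k = 1, 2, \ldots, n$ the probability that $k$ specified subintervals will exceed $b$ in length is equal to
$$\left(1 - \frac{kb}{a}\right)^{n-1} \quad \text{if } kb \le a,$$
and is equal to 0 otherwise.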
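The distribution of the number of long subintervals can be explored numerically. The sketch below assumes the convention that $n-1$ random points divide a line of length $a$ into $n$ subintervals, and it evaluates $P_k = \binom{n}{k} \sum_j (-1)^j \binom{n-k}{j} (1 - (k+j)b/a)^{n-1}$, the sum taken over those $j$ with $(k+j)b \le a$, which is what the inclusion-exclusion formula yields when $k$ specified subintervals all exceed $b$ with probability $(1 - kb/a)^{n-1}$. The function names and test values are illustrative assumptions.

```python
import random
from math import comb


def prob_exactly_k_exceed(k: int, n: int, b: float, a: float = 1.0) -> float:
    """P[exactly k of the n subintervals exceed b], for n >= 2."""
    total = 0.0
    for j in range(n - k + 1):
        remaining = 1.0 - (k + j) * b / a
        if remaining <= 0.0:
            break  # all later terms vanish as well
        total += (-1) ** j * comb(n - k, j) * remaining ** (n - 1)
    return comb(n, k) * total


def simulate_exactly_k(k, n, b, a=1.0, trials=100_000, seed=2):
    """Monte Carlo estimate using the order statistics of n - 1 points."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ordered = sorted(rng.uniform(0.0, a) for _ in range(n - 1))
        ends = [0.0] + ordered + [a]
        lengths = [ends[i + 1] - ends[i] for i in range(n)]
        if sum(length > b for length in lengths) == k:
            hits += 1
    return hits / trials


print(prob_exactly_k_exceed(1, 4, 0.3), simulate_exactly_k(1, 4, 0.3))
```

As a consistency check, the values for $k = 0, 1, \ldots, n$ should sum to 1.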
References to the large variety of problems to which (7.19) is applicable may be found in two papers: J. O. Irwin, "A Unified Derivation of Some Well-known Frequency Distributions of Interest in Biometry and Statistics," Journal of the Royal Statistical Society A, Vol. 118 (1955), pp. 389–398, and L. Takacs, "On a General Probability Theorem and Its Application in the Theory of Stochastic Processes," Proceedings of the Cambridge Philosophical Society, Vol. 54 (1958), pp. 219–224.
Theoretical Exercises
7.1. Buffon's Needle Problem. A smooth table is ruled with equidistant parallel lines at distance $a$ apart. A needle of length $l < a$ is dropped on the table. Show that the probability that it will cross one of the lines is $2l/(\pi a)$. For an account of some experiments made in connection with the Buffon Needle Problem see J. V. Uspensky, Introduction to Mathematical Probability, McGraw-Hill, New York, 1937, pp. 112–113.
7.2 . A straight line of unit length is divided into subintervals by points chosen at random. For , show that the probability that none of specified subintervals will be less than in length is equal to
Exercises
7.1 . A young man and a young lady plan to meet between 5 and 6 P.M., each agreeing not to wait more than 10 minutes for the other. Find the probability that they will meet if they arrive independently at random times between 5 and 6 P.M.
Answer
$11/36$.
7.2 . Consider light bulbs produced by a machine for which it is known that the life in hours of a light bulb produced by the machine is a random variable with probability density function
Consider a box containing 100 such bulbs, selected randomly from the output of the machine.
(i) What is the probability that a bulb selected randomly from the box will have a lifetime greater than 1020 hours?
(ii) What is the probability that a sample of 5 bulbs selected randomly from the box will contain (a) at least 1 bulb, (b) 4 or more bulbs with a lifetime greater than 1020 hours?
(iii) Find approximately the probability that the box will contain between 30 and 40 bulbs, inclusive, with a lifetime greater than 1020 hours.
7.3 . Six soldiers take up random positions on a road 2 miles long. What is the probability that the distance between any two soldiers will be more than (i) , (ii) , (iii) of a mile?
Answer
(i) 0; (ii) ; (iii) .
7.4 . Another version of Bertrand’s paradox . Let a chord be drawn at random in a given circle. What is the probability that the length of the chord will be greater than the side of the equilateral triangle inscribed in that circle?
7.5 . A point is chosen randomly on each of 2 adjacent sides of a square. Find the probability that the area of the triangle formed by the sides of the square and the line joining the 2 points will be (i) less than of the area of the square, (ii) greater than of the area of the square.
Answer
(i) ; (ii) 0.
7.6 . Three points are chosen randomly on the circumference of a circle. What is the probability that there will be a semicircle in which all will lie?
7.7. A line is divided into 3 subintervals by choosing 2 points randomly on the line. Find the probability that the 3 line segments thus formed could be made to form the sides of a triangle.
Answer
$1/4$.
7.8 . Find the probability that the roots of the equation will be real if (i) and are randomly chosen between 0 and 1, (ii) is randomly chosen between 0 and 1, and is randomly chosen between -1 and 1.
7.9 . In the interval to , points are chosen randomly. Find (i) the probability that the point lying farthest to the right will be to the right of the number 0.6, (ii) the probability that the point lying farthest to the left will be to the left of the number 0.6, (iii) the probability that the point lying next farthest to the left will be to the right of the number 0.6.
Answer
(i) ; (ii) ; (iii) .
7.10 . A straight line of unit length is divided into 10 subintervals by 9 points chosen at random. For any (i) number , (ii) number find the probability that none of the subintervals will exceed in length.