To introduce the notion of a numerical-valued random phenomenon, let us first consider a random phenomenon whose sample description space \(S\) is a set of real numbers; for example, the number of white balls in a sample of size \(n\) drawn from an urn or the number of hits in \(n\) independent throws of a dart. For the sample description space of each of these random phenomena one may take the set \(\{0,1,2, \ldots, n\}\). However, it has already been indicated (in section 3 of Chapter 1) that one may make the sample description space \(S\) as large as one pleases, at the price of having a large number of sample descriptions in \(S\) to which zero probability is assigned. Consequently, we may take for the sample description space of these phenomena the set of all real numbers from \(-\infty\) to \(\infty\). The advantage of this procedure is that it makes possible a unified theory of random phenomena whose sample description spaces are sets of real numbers.
There is still another advantage. Suppose one is measuring the weight of persons belonging to a certain group. One may measure the weight to the nearest pound, the nearest tenth of a pound, or the nearest hundredth of a pound. In the first case the space \(S=\left\{\text{real numbers }x:\ x=k \text{ for some integer } k=0,1,2, \ldots, 10^{4}\right\}\) would suffice as the sample description space; in the second case \(S=\left\{\text{real numbers }x:\ x=k / 10 \text{ for some integer } k=0,1,2, \ldots, 10^{5}\right\}\) would suffice; in the third case \(S=\left\{\text{real numbers }x:\ x=k / 100 \text{ for some integer } k=0,1,2, \ldots, 10^{6}\right\}\) would suffice. Nevertheless, it might be preferable in all three cases to take as one’s sample description space the set of all numbers from \(-\infty\) to \(\infty\) and to develop the difference between the three cases in terms of the different probability functions adopted to describe the three random phenomena.
We are thus led to define the notion of a numerical-valued random phenomenon as a random phenomenon whose sample description space is the set \(R\), consisting of all real numbers from \(-\infty\) to \(\infty\). The set \(R\) may be represented geometrically by a real line, which is an infinitely long line on which an origin and a unit distance have been marked off; then to every point on the line there corresponds a real number and to every real number there corresponds a point on the line.
We have previously defined an event as a set of sample descriptions; consequently, events defined on numerical-valued random phenomena are sets of real numbers. However, not every set of real numbers can be regarded as an event. There are certain sets of real numbers, defined by exceedingly involved limiting operations, that are nonprobabilizable, in the sense that for these sets it is not in general possible to answer, in a manner consistent with the axioms below, the question, “what is the probability that a given numerical-valued random phenomenon will have an observed value in the set?” Consequently, by the word “event” we mean not any set of real numbers but only a probabilizable set of real numbers. We do not possess at this stage in our discussion the notions with which to characterize the sets of real numbers that are probabilizable. We can point out only that it may be shown that the family (call it \(\mathscr{F}\)) of probabilizable sets always has the following properties:
- (i) To \(\mathscr{F}\) belongs any interval (an interval is a set of real numbers of the form \(\{x: a<x<b\}\), \(\{x: a<x \leq b\}\), \(\{x: a \leq x<b\}\), or \(\{x: a \leq x \leq b\}\), in which \(a\) and \(b\) may be finite or infinite numbers).
- (ii) To \(\mathscr{F}\) belongs the complement \(A^{c}\) of any set \(A\) belonging to \(\mathscr{F}\).
- (iii) To \(\mathscr{F}\) belongs the union \(\bigcup^{\infty}_{n=1} A_{n}\) of any sequence of sets \(A_{1}, A_{2}, \ldots, A_{n}, \ldots\) belonging to \(\mathscr{F}\).
If we desire to give a precise definition of the notion of an event at this stage in our discussion, we may do so as follows. There exists a smallest family of sets on the real line with the properties (i), (ii), and (iii). This family is denoted by \(\mathscr{B}\), and any member of \(\mathscr{B}\) is called a Borel set, after the great French mathematician and probabilist Émile Borel. Since \(\mathscr{B}\) is the smallest family to possess properties (i), (ii), and (iii), it follows that \(\mathscr{B}\) is contained in \(\mathscr{F}\), the family of probabilizable sets. Thus every Borel set is probabilizable. Since the needs of mathematical rigor are fully met by restricting our discussion to Borel sets, in this book, by an “event” concerning a numerical-valued random phenomenon, we mean a Borel set of real numbers.
We sum up the discussion of this section in a formal definition.
A numerical-valued random phenomenon is a random phenomenon whose sample description space is the set \(R\) (of all real numbers from \(-\infty\) to \(\infty\)) on whose subsets is defined a function \(P\left[\cdot\right]\), which to every Borel set of real numbers (also called an event) \(E\) assigns a nonnegative real number, denoted by \(P[E]\), according to the following axioms:
- Axiom 1. \(P[E] \geq 0\) for every event \(E\).
- Axiom 2. \(P[R]=1\).
- Axiom 3. For any sequence of events \(E_{1}, E_{2}, \ldots, E_{n}, \ldots\) which is mutually exclusive, \[P\left[\bigcup_{n=1}^{\infty} E_{n}\right]=\sum_{n=1}^{\infty} P\left[E_{n}\right].\]
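As an illustration of the definition, the following Python sketch builds a probability function for one of the phenomena mentioned earlier, the number of hits in \(n\) independent throws of a dart, while taking the whole real line as the sample description space. The names `make_hit_probability` and `P`, the choice \(n=3\), and the hit probability \(\frac{1}{2}\) are assumptions made only for illustration. The sketch handles closed intervals only, and it checks additivity for two mutually exclusive intervals rather than for an infinite sequence.

```python
from fractions import Fraction
from math import comb


def make_hit_probability(n, p):
    """Return a function P(a, b) giving the probability of the closed
    interval {x: a <= x <= b} when all mass sits on the points 0, 1, ..., n
    (binomial probabilities; an illustrative assumption)."""
    mass = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

    def P(a, b):
        # Sum the mass of every point of {0, ..., n} lying in [a, b].
        return sum(m for k, m in mass.items() if a <= k <= b)

    return P


P = make_hit_probability(3, Fraction(1, 2))
inf = float("inf")

total = P(-inf, inf)          # Axiom 2: probability of the whole real line
left = P(0, 1)                # event {x: 0 <= x <= 1}
right = P(2, 3)               # event {x: 2 <= x <= 3}, disjoint from the first
both = P(0, 3)                # contains all the mass, like the union above
gap = P(0.5, 0.9)             # an interval carrying zero probability

print(total, left, right, both, gap)
```

Every interval that contains none of the points \(0,1, \ldots, n\) receives probability zero, which is the price mentioned at the start of this section for enlarging the sample description space.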
Example 1A. Consider the random phenomenon that consists in observing the time one has to wait for a bus at a certain downtown bus stop. Let \(A\) be the event that one has to wait between 0 and 2 minutes, inclusive, and let \(B\) be the event that one has to wait between 1 and 3 minutes, inclusive. Assume that \(P[A]=\frac{1}{2}, P[B]=\frac{1}{2}, P[A B]=\frac{1}{3}\). We can now answer all the usual questions about the events \(A\) and \(B\). The conditional probability \(P[B \mid A]\) that \(B\) has occurred given that \(A\) has occurred is \(P[A B] / P[A]=\frac{2}{3}\). The probability that neither the event \(A\) nor the event \(B\) has occurred is given by \(P\left[A^{c} B^{c}\right]=1-P[A \cup B]=1-P[A]-P[B]+P[A B]=\frac{1}{3}\).
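The arithmetic of Example 1A can be checked with exact fractions. The short Python sketch below is only a verification aid; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Probabilities assumed in Example 1A.
p_a = Fraction(1, 2)
p_b = Fraction(1, 2)
p_ab = Fraction(1, 3)

# Conditional probability of B given A: P[AB] / P[A].
p_b_given_a = p_ab / p_a

# Probability that neither A nor B occurs: 1 - P[A u B].
p_union = p_a + p_b - p_ab
p_neither = 1 - p_union

print(p_b_given_a, p_neither)
```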
Exercise
1.1. Consider the events \(A\) and \(B\) defined in Example 1A. Assuming that \(P[A]=P[B]=\frac{1}{2}, P[A B]=\frac{1}{3}\), find, for \(k=0,1,2\), the probability that (i) exactly \(k\), (ii) at least \(k\), (iii) no more than \(k\) of the events \(A\) and \(B\) will occur.
Answer
\begin{align} & P[\text{exactly}\;0]=\frac{1}{3}.\quad P[\text{exactly}\;1]=\frac{1}{3}.\quad P[\text{exactly}\;2]=\frac{1}{3}.\quad P[\text{at least}\;0]=1. \\ & P[\text{at least}\;1]=\frac{2}{3}.\quad P[\text{at least}\;2]=\frac{1}{3}.\quad P[\text{at most}\;0]=\frac{1}{3}.\quad P[\text{at most}\;1]=\frac{2}{3}. \\ & P[\text{at most}\;2]=1. \end{align}
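One way to arrive at these values is to compute the three "exactly \(k\)" probabilities first and then add them up. The Python sketch below follows that route with exact fractions; the dictionary names `exactly`, `at_least`, and `at_most` are illustrative only.

```python
from fractions import Fraction

p_a = Fraction(1, 2)
p_b = Fraction(1, 2)
p_ab = Fraction(1, 3)

# Exactly k of the two events A and B occur.
exactly = {
    0: 1 - (p_a + p_b - p_ab),   # neither occurs: 1 - P[A u B]
    1: p_a + p_b - 2 * p_ab,     # one occurs but not both
    2: p_ab,                     # both occur
}

# At least k and at most k follow by summing the "exactly" values.
at_least = {k: sum(exactly[j] for j in range(k, 3)) for k in range(3)}
at_most = {k: sum(exactly[j] for j in range(0, k + 1)) for k in range(3)}

for k in range(3):
    print(k, exactly[k], at_least[k], at_most[k])
```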