One view that one may take about the nature of probability theory is that it is part of the study of nature in the same way that physics, chemistry, and biology are. Physics, chemistry, and biology may each be defined as the study of certain observable phenomena, which we may call, respectively, the physical, chemical, and biological phenomena. Similarly, one might be tempted to define probability theory as the study of certain observable phenomena, namely the random phenomena. However, a random phenomenon is generally also a phenomenon of some other type; it is a random physical phenomenon, or a random chemical phenomenon, and so on. Consequently, it would seem overly ambitious for researchers in probability theory to take as their province of research all random phenomena. In this book we take the view that probability theory is not directly concerned with the study of random phenomena but rather with the study of the methods of thinking that can be used in the study of random phenomena. More precisely, we make the following definition.
The theory of probability is concerned with the study of those methods of analysis that are common to the study of random phenomena in all the fields in which they arise. Probability theory is thus the study of the study of random phenomena, in the sense that it is concerned with those properties of random phenomena that depend essentially on the notion of randomness and not on any other aspects of the phenomenon considered. More fundamentally, the notions of randomness, of a random phenomenon, of statistical regularity, and of “probability” cannot be said to be obvious or intuitive. Consequently, one of the main aims of a study of the theory of probability is to clarify the meaning of these notions and to provide us with an understanding of them, in much the same way that the study of arithmetic enables us to count concrete objects and the study of electromagnetic wave theory enables us to transmit messages by wireless.
We regard probability theory as a part of mathematics. As is the case with all parts of mathematics, probability theory is constructed by means of the axiomatic method. One begins with certain undefined concepts. One then makes certain statements about the properties possessed by, and the relations between, these concepts. These statements are called the axioms of the theory. Then, by means of logical deduction, without any appeal to experience, various propositions (called theorems) are obtained from the axioms. Although the propositions do not refer directly to the real world, but are merely logical consequences of the axioms, they do represent conclusions about real phenomena, namely those real phenomena one is willing to assume possess the properties postulated in the axioms.
We are thus led to the notion of a mathematical model of a real phenomenon. A mathematical theory constructed by the axiomatic method is said to be a model of a real phenomenon, if one gives a rule for translating propositions of the mathematical theory into propositions about the real phenomenon. This definition is vague, for it does not state the character of the rules of translation one must employ. However, the foregoing definition is not meant to be a precise one but only to give the reader an intuitive understanding of the notion of a mathematical model. Generally speaking, to use a mathematical theory as a model for a real phenomenon, one needs only to give a rule for identifying the abstract objects about which the axioms of the mathematical theory speak with aspects of the real phenomenon. It is then expected that the theorems of the theory will depict the phenomenon to the same extent that the axioms do, for the theorems are merely logical consequences of the axioms.
As an example of the problem of building models for real phenomena, let us consider the problem of constructing a mathematical theory (or explanation) of the experience recorded in Table 1A, which led us to believe that a long series of trials (of the experiment of drawing a ball from an urn containing six balls, of which four are white and two red) would yield a white ball in approximately \(\frac{2}{3}\) of the trials. In the remainder of this chapter we shall construct a mathematical theory of this phenomenon, which we believe to be a satisfactory model of certain features of it. It may clarify the ideas involved, however, if we consider here an explanation of this phenomenon, which we shall then criticize.
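The experience described above can be imitated on a computer. The following is a minimal sketch, not a reproduction of Table 1A: the function name `relative_frequency_white`, the number of trials, and the seed are illustrative choices, and the pseudorandom generator is assumed to pick each of the six balls equally often, which is itself an assumption about the experiment rather than something the text has established.

```python
import random


def relative_frequency_white(trials, seed=0):
    """Simulate drawing one ball (with replacement) from the urn `trials` times.

    Returns the fraction of trials in which a white ball was drawn.
    Assumption: rng.choice selects each of the six balls with equal chance.
    """
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    urn = ["white"] * 4 + ["red"] * 2  # four white balls, two red balls
    white_draws = sum(1 for _ in range(trials) if rng.choice(urn) == "white")
    return white_draws / trials


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, relative_frequency_white(n))
```

For long series of simulated trials the printed relative frequencies settle near \(\frac{2}{3}\), which is the kind of statistical regularity a model of the urn experiment is asked to account for.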
We imagine that we are permitted to label the six balls in the urn with numbers 1 to 6, labeling the four white balls with numbers 1 to 4. When a ball is drawn from the urn, there are six possible outcomes that can be recorded; namely, that ball number 1 was drawn, that ball number 2 was drawn, etc. Now four of these outcomes correspond to the outcome that a white ball is drawn. Therefore the ratio of the number of outcomes of the experiment favorable to a white ball being drawn to the number of all possible outcomes is equal to \(\frac{2}{3}\). Consequently, in order to “explain” why the observed relative frequency of the drawing of a white ball from the urn is equal to \(\frac{2}{3}\), one need only adopt this assumption (stated rather informally): the probability of an event (by which is meant the relative frequency with which an event, such as the drawing of a white ball, is observed to occur in a long series of trials of some experiment) is equal to the ratio of the number of outcomes of the experiment in which the event may be observed to the number of all possible outcomes of the experiment.
There are several grounds on which one may criticize the foregoing explanation. First, one may state that it is not mathematical, since it does not possess a structure of axioms and theorems. This defect may perhaps be remedied by using the tools that we develop in the remainder of this chapter; consequently, we shall not press this criticism. However, there is a second defect in the explanation that cannot be repaired. The assumption stated, that the probability of an event is equal to a certain ratio, does not lead to an explanation of the observed phenomenon because by counting in different ways one can obtain different values for the ratio. We have already obtained a value of \(\frac{2}{3}\) for the ratio; we next obtain a value of \(\frac{1}{2}\). If one argues that there are merely two outcomes (either a white ball or a nonwhite ball is drawn), then exactly one of these outcomes is favorable to a white ball being drawn. Therefore, the ratio of the number of outcomes favorable to a white ball being drawn to the number of possible outcomes is \(\frac{1}{2}\).
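The two ways of counting can be placed side by side in a short computation. This is only a sketch of the ratio rule criticized above; the helper `classical_ratio` and the two outcome lists are names introduced here for illustration.

```python
from fractions import Fraction


def classical_ratio(outcomes, is_favorable):
    """Ratio of favorable outcomes to all listed outcomes.

    The rule itself says nothing about which list of outcomes to use.
    """
    favorable = sum(1 for outcome in outcomes if is_favorable(outcome))
    return Fraction(favorable, len(outcomes))


# First way of counting: one outcome per labeled ball; balls 1 to 4 are white.
outcomes_by_ball = [1, 2, 3, 4, 5, 6]
ratio_by_ball = classical_ratio(outcomes_by_ball, lambda ball: ball <= 4)

# Second way of counting: one outcome per color class.
outcomes_by_color = ["white", "nonwhite"]
ratio_by_color = classical_ratio(outcomes_by_color, lambda color: color == "white")

print(ratio_by_ball)   # 2/3
print(ratio_by_color)  # 1/2
```

The same rule, applied to the same experiment, returns \(\frac{2}{3}\) or \(\frac{1}{2}\) depending solely on how the outcomes are listed, and nothing in the rule singles out one listing as the correct one.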
We now proceed to develop the mathematical tools we require to construct satisfactory models of random phenomena.