The notion of the probability law of a random phenomenon is introduced in this section in order to provide a concise and intuitively meaningful language for describing the probability properties of a random phenomenon.
In order to describe a numerical valued random phenomenon, it is necessary and sufficient to state its probability function \(P[\cdot]\); this is equivalent to stating, for any Borel set \(B\) of real numbers, the probability that an observed value of the random phenomenon will be in the Borel set \(B\). However, other functions exist, a knowledge of which is equivalent to a knowledge of the probability function. The distribution function is one such function, for between probability functions and distribution functions there is a one-to-one correspondence. Similarly, between discrete distribution functions and probability mass functions and between continuous distribution functions and probability density functions one-to-one correspondences exist. Thus we have available different, but equivalent, representations of the same mathematical concept, which we may call the probability law (or sometimes the probability distribution) of the numerical valued random phenomenon.
A probability law is called discrete if it corresponds to a discrete distribution function and continuous if it corresponds to a continuous distribution function.
For example, suppose one is considering the numerical valued random phenomenon that consists in observing the number of hits in five independent tosses of a dart at a target, where the probability at each toss of hitting the target is some constant \(p\) (we write \(q=1-p\)). To describe the phenomenon, one needs to know, by definition, the probability function \(P[\cdot]\), which for any set \(E\) of real numbers is given by
\[P[E]=\sum_{k \text{ in } E\{0,1, \ldots, 5\}}\binom{5}{k} p^{k} q^{5-k}. \tag{4.1}\]
It should be recalled that \(E\{0,1, \ldots, 5\}\) represents the intersection of the sets \(E\) and \(\{0,1, \ldots, 5\}\) .
Equivalently, one may describe the phenomenon by stating its distribution function \(F(\cdot)\); this is done by giving the value of \(F(x)\) at any real number \(x\), \[F(x)=\sum_{k=0}^{[x]}\binom{5}{k} p^{k} q^{5-k}. \tag{4.2}\] It should be recalled that \([x]\) denotes the largest integer less than or equal to \(x\).
Equivalently, since the distribution function is discrete, one may describe the phenomenon by stating its probability mass function \(p(\cdot)\) , given by \begin{align} p(x) = \begin{cases} \displaystyle \binom{5}{x} p^{x} q^{5-x}, & \text{for } x = 0, 1, \ldots, 5 \\[2mm] 0, & \text{otherwise}. \end{cases} \tag{4.3} \end{align}
Equations (4.1) , (4.2) , and (4.3) constitute equivalent representations, or statements, of the same concept, which we call the probability law of the random phenomenon. This particular probability law is discrete.
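To see the equivalence concretely, the following Python sketch computes the three representations (4.1), (4.2), and (4.3) and checks that they agree; the values \(n=5\) and \(p=0.3\) are illustrative assumptions, not taken from the text.

```python
# A minimal numerical sketch of the equivalence of (4.1)-(4.3).
# The values n = 5 and p = 0.3 are illustrative assumptions.
from math import comb, floor

n, p = 5, 0.3
q = 1 - p

def pmf(x):
    """Probability mass function (4.3)."""
    return comb(n, x) * p**x * q**(n - x) if 0 <= x <= n else 0.0

def cdf(x):
    """Distribution function (4.2); [x] denotes the largest integer <= x."""
    return sum(pmf(k) for k in range(0, floor(x) + 1))

def prob(E):
    """Probability function (4.1); E{0,1,...,5} denotes an intersection."""
    return sum(pmf(k) for k in set(E) & set(range(n + 1)))

# The three representations agree: P[{0,1,2,3}] = F(3) = p(0)+...+p(3).
assert abs(prob({0, 1, 2, 3}) - cdf(3)) < 1e-12
print(cdf(3))  # approximately 0.96922
```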
We next note that probability laws may be classified into families on the basis of similar functional form . For example, consider the function \(b(\cdot ; n, p)\) defined for any \(n=1,2, \ldots\) and \(0 \leq p \leq 1\) by \begin{align} b(x; n, p) = \begin{cases} \displaystyle \binom{n}{x} p^{x} q^{n-x}, & \text{for } x = 0, 1, \ldots, n \\[2mm] 0, & \text{otherwise.} \end{cases} \end{align}
For fixed values of \(n\) and \(p\) the function \(b(\cdot ; n, p)\) is a probability mass function and thus defines a probability law. The probability laws determined by \(b\left(\cdot ; n_{1}, p_{1}\right)\) and \(b\left(\cdot ; n_{2}, p_{2}\right)\) for two different sets of values \(n_{1}, p_{1}\) and \(n_{2}, p_{2}\) are different. Nevertheless, the common functional form of the two functions \(b\left(\cdot ; n_{1}, p_{1}\right)\) and \(b\left(\cdot ; n_{2}, p_{2}\right)\) enables us to treat simultaneously the two probability laws that they determine. We call \(n\) and \(p\) parameters, and \(b(\cdot ; n, p)\) the probability mass function of the binomial probability law with parameters \(n\) and \(p\) .
We next list some frequently occurring discrete probability laws, to be followed by a list of some frequently occurring continuous probability laws.
The Bernoulli probability law with parameter \(p\) , where \(0 \leq p \leq 1\) , is specified by the probability mass function \begin{align} p(x) = \begin{cases} p, & \text{if } x = 1 \\[2mm] 1 - p = q, & \text{if } x = 0 \\[2mm] 0, & \text{otherwise.} \end{cases} \tag{4.4} \end{align}
An example of a numerical valued random phenomenon obeying the Bernoulli probability law with parameter \(p\) is the outcome of a Bernoulli trial in which the probability of success is \(p\), if instead of denoting success and failure by \(s\) and \(f\), we denote them by 1 and 0, respectively.
The binomial probability law with parameters \(n\) and \(p\), where \(n=1,2,\ldots\) and \(0 \leq p \leq 1\), is specified by the probability mass function \begin{align} p(x) = \begin{cases} \displaystyle \binom{n}{x} p^{x} (1-p)^{n-x}, & \text{for } x = 0, 1, \ldots, n \\[2mm] 0, & \text{otherwise.} \end{cases} \tag{4.5} \end{align}
An important example of a numerical valued random phenomenon obeying the binomial probability law with parameters \(n\) and \(p\) is the number of successes in \(n\) independent repeated Bernoulli trials in which the probability of success at each trial is \(p\) .
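As an informal check of this correspondence, one may simulate repeated Bernoulli trials and compare the observed relative frequencies with the probability mass function (4.5). In the sketch below, \(n=10\), \(p=0.4\), and the number of repetitions are illustrative assumptions.

```python
# A simulation sketch: relative frequencies of k successes in n Bernoulli
# trials versus the binomial pmf (4.5). The values n = 10, p = 0.4, and the
# repetition count are illustrative assumptions.
import random
from math import comb

n, p, reps = 10, 0.4, 200_000
random.seed(1)  # for reproducibility of the sketch

counts = [0] * (n + 1)
for _ in range(reps):
    counts[sum(random.random() < p for _ in range(n))] += 1

for k in range(n + 1):
    exact = comb(n, k) * p**k * (1 - p)**(n - k)
    print(k, round(counts[k] / reps, 4), round(exact, 4))
```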
The Poisson probability law with parameter \(\lambda\) , where \(\lambda>0\) , is specified by the probability mass function \begin{align} p(x) = \begin{cases} e^{-\lambda} \frac{\lambda^{x}}{x!}, & \text{for } x = 0, 1, 2, \ldots \\[2mm] 0, & \text{otherwise.} \end{cases} \tag{4.6} \end{align}
In section 3 of Chapter 3 it was seen that the Poisson probability law provides under certain conditions an approximation to the binomial probability law. In section 3 of Chapter 6 we discuss random phenomena that obey the Poisson probability law.
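The following numerical sketch illustrates the approximation for the illustrative values \(n=1000\) and \(p=0.002\), so that \(\lambda = np = 2\); the two probability mass functions agree to several decimal places.

```python
# A numerical sketch of the Poisson approximation to the binomial law:
# n large, p small, lambda = n*p fixed. The values n = 1000 and p = 0.002
# are illustrative assumptions.
from math import comb, exp, factorial

n, p = 1000, 0.002
lam = n * p  # lambda = 2

for x in range(6):
    binomial = comb(n, x) * p**x * (1 - p)**(n - x)
    poisson = exp(-lam) * lam**x / factorial(x)
    print(x, round(binomial, 5), round(poisson, 5))
```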
The geometric probability law with parameter \(p\) , where \(0 \leq p \leq 1\) , is specified by the probability mass function \begin{align} p(x) = \begin{cases} p(1-p)^{x-1}, & \text{for } x = 1, 2, \ldots \\[2mm] 0, & \text{otherwise.} \end{cases} \tag{4.7} \end{align}
An important example of a numerical valued random phenomenon obeying the geometric probability law with parameter \(p\) is the number of trials required to obtain the first success in a sequence of independent repeated Bernoulli trials in which the probability of success at each trial is \(p\) .
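As an informal check, one may simulate sequences of Bernoulli trials, record the number of trials up to and including the first success, and compare the observed frequencies with (4.7); the value \(p=0.3\) below is an illustrative assumption.

```python
# A simulation sketch of the geometric law (4.7): the number of Bernoulli
# trials up to and including the first success. p = 0.3 is illustrative.
import random

p, reps = 0.3, 200_000
random.seed(1)

counts = {}
for _ in range(reps):
    x = 1
    while random.random() >= p:  # keep drawing until the first success
        x += 1
    counts[x] = counts.get(x, 0) + 1

for x in range(1, 7):
    print(x, round(counts.get(x, 0) / reps, 4), round(p * (1 - p)**(x - 1), 4))
```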
The hypergeometric probability law with parameters \(N, n\), and \(p\) (where \(N\) may be any integer \(1,2,\ldots\); \(n\) is an integer in the set \(1,2,\ldots,N\); and \(p=0, 1/N, 2/N, \ldots, 1\)) is specified by the probability mass function, letting \(q=1-p\), \begin{align} p(x) = \begin{cases} \frac{\displaystyle \binom{Np}{x} \binom{Nq}{n-x}}{\displaystyle \binom{N}{n}}, & \text{for } x = 0, 1, \ldots, n \\[2mm] 0, & \text{otherwise.} \end{cases} \tag{4.8} \end{align}
The hypergeometric probability law may also be defined by using (2.31), for any value of \(p\) in the interval \(0 \leq p \leq 1\). An example of a random phenomenon obeying the hypergeometric probability law is given by the number of white balls contained in a sample of size \(n\) drawn without replacement from an urn containing \(N\) balls, of which \(Np\) are white.
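The urn interpretation may be checked by simulation; the sketch below draws samples without replacement and compares the observed frequencies with (4.8), using the illustrative values \(N=20\), \(n=5\), \(p=0.4\).

```python
# A simulation sketch of the hypergeometric law (4.8): samples of size n drawn
# without replacement from an urn of N balls, Np of them white. The values
# N = 20, n = 5, p = 0.4 are illustrative assumptions.
import random
from math import comb

N, n, p = 20, 5, 0.4
white = int(N * p)                      # 8 white balls
urn = [1] * white + [0] * (N - white)   # 1 marks a white ball
reps = 100_000
random.seed(1)

counts = [0] * (n + 1)
for _ in range(reps):
    counts[sum(random.sample(urn, n))] += 1

for x in range(n + 1):
    exact = comb(white, x) * comb(N - white, n - x) / comb(N, n)
    print(x, round(counts[x] / reps, 4), round(exact, 4))
```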
The negative binomial probability law with parameters \(r\) and \(p\) , where \(r=1,2, \ldots\) and \(0 \leq p \leq 1\) , is specified by the probability mass function, letting \(q=1-p\) , \begin{align} p(x) = \begin{cases} \displaystyle \binom{r+x-1}{x} p^{r} q^{x} = \binom{-r}{x} p^{r} (-q)^{x}, & \text{for } x = 0, 1, \ldots \\[2mm] 0, & \text{otherwise.} \end{cases} \tag{4.9} \end{align}
An example of a random phenomenon obeying the negative binomial probability law with parameters \(r\) and \(p\) is the number of failures encountered in a sequence of independent repeated Bernoulli trials (with probability \(p\) of success at each trial) before the \(r\)th success. Note that the number of trials required to achieve the \(r\)th success is equal to \(r\) plus the number of failures encountered before the \(r\)th success.
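The equality of the two coefficient forms in (4.9) rests on the identity \(\binom{r+x-1}{x}=(-1)^{x}\binom{-r}{x}\), where \(\binom{-r}{x}\) denotes the generalized binomial coefficient \((-r)(-r-1)\cdots(-r-x+1)/x!\). A short numerical sketch of this identity follows; the value \(r=3\) is an illustrative assumption.

```python
# A sketch of the identity behind the two coefficient forms in (4.9):
# C(r+x-1, x) = (-1)^x * C(-r, x), where C(-r, x) is the generalized
# binomial coefficient. The value r = 3 is an illustrative assumption.
from math import comb, factorial

def gen_binom(a, x):
    """Generalized binomial coefficient a(a-1)...(a-x+1)/x!, integer x >= 0."""
    prod = 1
    for i in range(x):
        prod *= a - i
    return prod / factorial(x)

r = 3
for x in range(8):
    assert comb(r + x - 1, x) == (-1)**x * gen_binom(-r, x)
print("identity verified for r = 3 and x = 0, 1, ..., 7")
```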
Some important continuous probability laws are the following.
The uniform probability law over the interval \(a\) to \(b\), where \(a\) and \(b\) are any finite real numbers such that \(a<b\), is specified by the probability density function \[f(x) = \left\{\begin{aligned} &\frac{1}{b-a}, && \text{for } a < x < b \\[2mm] &0, && \text{otherwise.} \end{aligned}\right. \tag{4.10}\] Examples of random phenomena obeying a uniform probability law are discussed in section 5.
The normal probability law with parameters \(m\) and \(\sigma\), where \(-\infty<m<\infty\) and \(\sigma>0\), is specified by the probability density function \[f(x)=\frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2}\left(\frac{x-m}{\sigma}\right)^{2}}, \quad -\infty<x<\infty. \tag{4.11}\]
The role played by the normal probability law in probability theory is discussed in Chapter 6 . In section 6 we introduce certain functions that are helpful in the study of the normal probability law.
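As a quick sanity check that (4.11) is indeed a probability density function, one may integrate it numerically; the sketch below uses crude trapezoidal quadrature, and the values \(m=1\) and \(\sigma=2\) are illustrative assumptions.

```python
# A crude trapezoidal check that the normal density (4.11) integrates to 1.
# The values m = 1.0 and sigma = 2.0 are illustrative assumptions.
from math import exp, pi, sqrt

m, sigma = 1.0, 2.0

def f(x):
    return exp(-0.5 * ((x - m) / sigma)**2) / (sigma * sqrt(2 * pi))

# integrate over m +/- 10 sigma, outside of which the density is negligible
a, b, steps = m - 10 * sigma, m + 10 * sigma, 100_000
h = (b - a) / steps
total = h * sum(f(a + i * h) for i in range(steps + 1)) - 0.5 * h * (f(a) + f(b))
print(total)  # approximately 1.0
```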
The exponential probability law with parameter \(\lambda\) , in which \(\lambda>0\) , is specified by the probability density function \begin{align} f(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{for } x > 0 \\[2mm] 0, & \text{otherwise.} \end{cases} \tag{4.12} \end{align}
The gamma probability law with parameters \(r\) and \(\lambda\) , in which \(r=1,2, \ldots\) and \(\lambda>0\) , is specified by the probability density function
\begin{align} f(x) = \begin{cases} \dfrac{\lambda}{(r-1)!} (\lambda x)^{r-1} e^{-\lambda x}, & \text{for } x \geq 0 \\[2mm] 0, & \text{otherwise.} \end{cases} \tag{4.13} \end{align}
The exponential and gamma probability laws are discussed in Chapter 6 .
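One immediate relation, visible by comparing (4.12) and (4.13), is that the gamma probability law with \(r=1\) is the exponential probability law. The sketch below checks this numerically at a few points; \(\lambda=0.5\) and the test points are illustrative assumptions.

```python
# A sketch of an observation visible from (4.12) and (4.13): the gamma
# density with r = 1 is the exponential density. lambda = 0.5 and the test
# points are illustrative assumptions.
from math import exp, factorial

lam = 0.5

def gamma_density(x, r):
    """Gamma density (4.13) for integer r and x >= 0."""
    return lam / factorial(r - 1) * (lam * x)**(r - 1) * exp(-lam * x)

def exponential_density(x):
    """Exponential density (4.12) for x > 0."""
    return lam * exp(-lam * x)

for x in [0.1, 1.0, 2.5, 7.0]:
    assert abs(gamma_density(x, 1) - exponential_density(x)) < 1e-15
print("gamma with r = 1 agrees with the exponential law")
```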
The Cauchy probability law with parameters \(\alpha\) and \(\beta\), in which \(-\infty<\alpha<\infty\) and \(\beta>0\), is specified by the probability density function \[f(x)=\frac{1}{\pi \beta\left\{1+\left(\frac{x-\alpha}{\beta}\right)^{2}\right\}}, \quad -\infty<x<\infty. \tag{4.14}\]
Student’s distribution with parameter \(n=1,2, \ldots\) (also called Student’s \(t\)-distribution with \(n\) degrees of freedom) is specified by the probability density function \[f(x)=\frac{1}{\sqrt{n \pi}} \frac{\Gamma[(n+1) / 2]}{\Gamma(n / 2)}\left(1+\frac{x^{2}}{n}\right)^{-(n+1) / 2}, \quad -\infty<x<\infty. \tag{4.15}\]
It should be noted that Student’s distribution with parameter \(n=1\) coincides with the Cauchy probability law with parameters \(\alpha=0\) and \(\beta=1\) .
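This coincidence is easy to check numerically; the sketch below compares (4.15) with \(n=1\) against (4.14) with \(\alpha=0\), \(\beta=1\) at a few illustrative points.

```python
# A numerical sketch of the coincidence noted above: Student's density (4.15)
# with n = 1 agrees with the Cauchy density (4.14) with alpha = 0, beta = 1.
# The test points are illustrative assumptions.
from math import gamma, pi, sqrt

def student(x, n):
    return (gamma((n + 1) / 2) / (sqrt(n * pi) * gamma(n / 2))
            * (1 + x**2 / n) ** (-(n + 1) / 2))

def cauchy(x, alpha=0.0, beta=1.0):
    return 1 / (pi * beta * (1 + ((x - alpha) / beta)**2))

for x in [-3.0, -0.5, 0.0, 1.0, 4.0]:
    assert abs(student(x, 1) - cauchy(x)) < 1e-12
print("Student's t with n = 1 matches Cauchy(0, 1) at the test points")
```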
The \(\chi^{2}\) distribution with parameters \(n=1,2, \ldots\) and \(\sigma>0\) is specified by the probability density function \begin{align} f(x) = \begin{cases} \frac{1}{2^{n/2} \sigma^{n} \Gamma(n/2)} x^{(n/2)-1} e^{-\left(x / 2 \sigma^{2}\right)}, & \text{for } x > 0 \\[2mm] 0, & \text{for } x < 0 \end{cases} \tag{4.16} \end{align}
The symbol \(\chi\) is the Greek letter chi, and one sometimes writes chi-square for \(\chi^{2}\) . The \(\chi^{2}\) distribution with parameters \(n\) and \(\sigma=1\) is called in statistics the \(\chi^{2}\) distribution with \(n\) degrees of freedom. The \(\chi^{2}\) distribution with parameters \(n\) and \(\sigma\) coincides with the gamma distribution with parameters \(r=n / 2\) and \(\lambda=1 /\left(2 \sigma^{2}\right)\) [to define the gamma probability law for non-integer \(r\) , replace \((r-1)\) ! in (4.13) by \(\Gamma(r)]\) .
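The stated coincidence between the \(\chi^{2}\) and gamma laws may be checked numerically; the sketch below compares the two densities at a few points, using the illustrative values \(n=5\) and \(\sigma=1.5\).

```python
# A numerical sketch of the coincidence just stated: the chi-square density
# (4.16) equals the gamma density (4.13) with r = n/2 and lambda =
# 1/(2 sigma^2), Gamma(r) replacing (r-1)!. The values n = 5 and sigma = 1.5
# are illustrative assumptions.
from math import exp, gamma

n, sigma = 5, 1.5

def chi_square_density(x):
    return (x**(n / 2 - 1) * exp(-x / (2 * sigma**2))
            / (2**(n / 2) * sigma**n * gamma(n / 2)))

def gamma_density(x, r, lam):
    return lam / gamma(r) * (lam * x)**(r - 1) * exp(-lam * x)

for x in [0.5, 2.0, 6.0]:
    assert abs(chi_square_density(x)
               - gamma_density(x, n / 2, 1 / (2 * sigma**2))) < 1e-12
print("chi-square(n, sigma) agrees with gamma(n/2, 1/(2 sigma^2))")
```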
The \(\chi\) distribution with parameters \(n=1,2, \ldots\) and \(\sigma>0\) is specified by the probability density function \begin{align} f(x) = \begin{cases} \frac{2(n/2)^{n/2}}{\sigma^{n} \Gamma(n/2)} x^{n-1} e^{-\left(\frac{n}{2\sigma^{2}}\right) x^{2}}, & \text{for } x > 0 \\[2mm] 0, & \text{for } x < 0. \end{cases} \tag{4.17} \end{align}
The \(\chi\) distribution with parameters \(n\) and \(\sigma=1\) is often called the chi distribution with \(n\) degrees of freedom. (The relation between the \(\chi^{2}\) and \(\chi\) distributions is given in exercise 8.1 of Chapter 7 ).
The Rayleigh distribution with parameter \(\alpha>0\) is specified by the probability density function \begin{align} f(x) = \begin{cases} \frac{1}{\alpha^{2}} x e^{-\frac{1}{2} \left(\frac{x}{\alpha}\right)^{2}}, & \text{for } x > 0 \\[2mm] 0, & \text{for } x < 0. \end{cases} \tag{4.18} \end{align} The Rayleigh distribution coincides with the \(\chi\) distribution with parameters \(n=2\) and \(\sigma=\alpha \sqrt{2}\) .
The Maxwell distribution with parameter \(\alpha>0\) is specified by the probability density function \begin{align} f(x) = \begin{cases} \frac{4}{\sqrt{\pi}} \frac{1}{\alpha^{3}} x^{2} e^{-\frac{x^{2}}{\alpha^{2}}}, & \text{for } x > 0 \\[2mm] 0, & \text{for } x < 0. \end{cases} \tag{4.19} \end{align}
The Maxwell distribution with parameter \(\alpha\) coincides with the \(\chi\) distribution with parameters \(n=3\) and \(\sigma=\alpha \sqrt{3 / 2}\).
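Both coincidences may be checked numerically against the chi density (4.17); in the sketch below, \(\alpha=1.3\) and the test points are illustrative assumptions.

```python
# A numerical sketch of the two coincidences: the Rayleigh density (4.18) is
# the chi density (4.17) with n = 2, sigma = alpha*sqrt(2), and the Maxwell
# density (4.19) is the chi density with n = 3, sigma = alpha*sqrt(3/2).
# alpha = 1.3 is an illustrative assumption.
from math import exp, gamma, pi, sqrt

alpha = 1.3

def chi_density(x, n, sigma):
    return (2 * (n / 2)**(n / 2) / (sigma**n * gamma(n / 2))
            * x**(n - 1) * exp(-(n / (2 * sigma**2)) * x**2))

rayleigh = lambda x: x / alpha**2 * exp(-0.5 * (x / alpha)**2)
maxwell = lambda x: 4 / sqrt(pi) / alpha**3 * x**2 * exp(-x**2 / alpha**2)

for x in [0.2, 1.0, 2.7]:
    assert abs(rayleigh(x) - chi_density(x, 2, alpha * sqrt(2))) < 1e-12
    assert abs(maxwell(x) - chi_density(x, 3, alpha * sqrt(3 / 2))) < 1e-12
print("Rayleigh and Maxwell match the corresponding chi densities")
```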
The \(F\) distribution with parameters \(m=1,2, \ldots\) and \(n=1,2, \ldots\) is specified by the probability density function \begin{align} f(x) = \begin{cases} \displaystyle \frac{\Gamma\left(\dfrac{m+n}{2}\right)}{\Gamma\left(\dfrac{m}{2}\right) \Gamma\left(\dfrac{n}{2}\right)} \left(\dfrac{m}{n}\right)^{m / 2} \frac{x^{(m / 2)-1}}{\left[1+\left(\dfrac{m}{n}\right) x\right]^{(m+n) / 2}}, & \text{for } x > 0 \\[2mm] 0, & \text{for } x < 0. \end{cases} \tag{4.20} \end{align}
The beta probability law with parameters \(a\) and \(b\), in which \(a\) and \(b\) are positive real numbers, is specified by the probability density function \begin{align} f(x) = \begin{cases} \displaystyle \frac{1}{B(a, b)} x^{a-1}(1-x)^{b-1}, & \text{for } 0 < x < 1 \\[2mm] 0, & \text{elsewhere.} \end{cases} \tag{4.21} \end{align} Here \(B(a, b)\) denotes the beta function, which satisfies \(B(a, b)=\Gamma(a) \Gamma(b) / \Gamma(a+b)\).
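As a sanity check, one may verify numerically that (4.21) integrates to 1, computing \(B(a,b)\) from the Gamma-function identity above; the values \(a=2.5\) and \(b=4\) are illustrative assumptions.

```python
# A crude midpoint-rule check that the beta density (4.21) integrates to 1,
# with B(a, b) computed as Gamma(a)Gamma(b)/Gamma(a+b). The parameter values
# a = 2.5 and b = 4.0 are illustrative assumptions.
from math import gamma

a, b = 2.5, 4.0
B = gamma(a) * gamma(b) / gamma(a + b)

steps = 100_000
h = 1.0 / steps
# evaluate at midpoints, which avoids the endpoints x = 0 and x = 1
total = sum(((i + 0.5) * h)**(a - 1) * (1 - (i + 0.5) * h)**(b - 1)
            for i in range(steps)) * h / B
print(total)  # approximately 1.0
```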
Theoretical Exercises
4.1 . The probability law of the number of white balls in a sample drawn without replacement from an urn of random composition . Consider an urn containing \(N\) balls. Suppose that the number of white balls in the urn is a numerical valued random phenomenon obeying (i) a binomial probability law with parameters \(N\) and \(p\), (ii) a hypergeometric probability law with parameters \(M, N\), and \(p\). [For example, suppose that the balls in the urn constitute a sample of size \(N\) drawn with replacement (without replacement) from a box containing \(M\) balls, of which a proportion \(p\) is white.] Let a sample of size \(n\) be drawn without replacement from the urn. Show that the number of white balls in the sample obeys either a binomial probability law with parameters \(n\) and \(p\), or a hypergeometric probability law with parameters \(M, n\), and \(p\), depending on whether the number of white balls in the urn obeys a binomial or a hypergeometric probability law.
Hint: Establish the conditions under which the following statements are valid: \begin{align} \binom{N}{m} &= \binom{N-k}{m-k} \frac{(N)_{k}}{(m)_{k}}, \\ \frac{\displaystyle \binom{m}{k}\binom{N-m}{n-k}}{\displaystyle \binom{N}{n}} &= \frac{\displaystyle \binom{n}{k}\binom{N-n}{m-k}}{\displaystyle \binom{N}{m}}; \\ \sum_{m=0}^{N} \frac{\displaystyle \binom{m}{k}\binom{N-m}{n-k}}{\displaystyle \binom{N}{n}}\, p(m) &= \sum_{m=k}^{N-n+k} \frac{\displaystyle \binom{n}{k}\binom{N-n}{m-k}}{\displaystyle \binom{N}{m}}\, p(m) \end{align} where
\begin{align} p(m) &= \binom{N}{m} p^{m} q^{N-m}, \\ p(m) &= \frac{\displaystyle \binom{Mp}{m}\binom{Mq}{N-m}}{\displaystyle \binom{M}{N}} = \frac{\displaystyle \binom{N}{m}\binom{M-N}{Mp-m}}{\displaystyle \binom{M}{Mp}}. \end{align} Finally, use the fact that \[\frac{\displaystyle \binom{Mp}{k}\binom{Mq}{n-k}}{\displaystyle \binom{M}{n}}=\frac{\displaystyle \binom{n}{k}\binom{M-n}{Mp-k}}{\displaystyle \binom{M}{Mp}}.\]
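For the binomial case, the conclusion of the exercise may also be checked numerically: mixing the hypergeometric sampling probabilities with binomial weights reproduces the binomial law with parameters \(n\) and \(p\). The sketch below uses the illustrative values \(N=10\), \(n=4\), \(p=0.3\).

```python
# A numerical check of the binomial case of this exercise: if the number of
# white balls in an urn of N balls is binomial (N, p), then the number of
# white balls in a sample of size n drawn without replacement is binomial
# (n, p). The values N = 10, n = 4, p = 0.3 are illustrative assumptions.
from math import comb

N, n, p = 10, 4, 0.3

for k in range(n + 1):
    mixture = sum(
        comb(m, k) * comb(N - m, n - k) / comb(N, n)  # hypergeometric factor
        * comb(N, m) * p**m * (1 - p)**(N - m)        # binomial weight on m
        for m in range(N + 1)                          # comb() is 0 off-range
    )
    direct = comb(n, k) * p**k * (1 - p)**(n - k)
    assert abs(mixture - direct) < 1e-12
print("the mixture law is binomial with parameters n and p")
```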
Exercises
4.1 . Give formulas for, and identify, the probability law of each of the following numerical valued random phenomena:
(i) The number of defectives in a sample of size 20, chosen without replacement from a batch of 200 articles, of which \(5 \%\) are defective.
(ii) The number of baby boys in a series of 30 independent births, assuming the probability at each birth that a boy will be born is 0.51.
(iii) The minimum number of babies a woman must have in order to give birth to a boy (ignore multiple births, assume independence, and assume the probability at each birth that a boy will be born is 0.51 ).
(iv) The number of patients in a group of 35 having a certain disease who will recover if the long-run frequency of recovery from this disease is \(75 \%\) (assume that each patient has an independent chance to recover).
In exercises 4.2–4.9 consider an urn containing 12 balls, numbered 1 to 12. Further, the balls numbered 1 to 8 are white, and the remaining balls are red. Give a formula for the probability law of the numerical valued random phenomenon described.
Answer
(i) Hypergeometric with parameters \(N=200, n=20, p=0.05\) ; (ii) binomial with parameters \(n=30, p=0.51\) ; (iii) geometric with parameter \(p=0.51\) ; (iv) binomial with parameters \(n=35, p=0.75\) .
4.2 . The number of white balls in a sample of size 6 drawn from the urn without replacement.
4.3 . The number of white balls in a sample of size 6 drawn from the urn with replacement.
Answer
\(p(x)=\binom{6}{x}\left(\frac{2}{3}\right)^{x}\left(\frac{1}{3}\right)^{6-x} \quad\) for \(x=0,1, \ldots, 6; 0\) otherwise.
4.4 . The smallest number occurring on the balls in a sample of size 6, drawn from the urn without replacement (see theoretical exercise 5.1 of Chapter 2).
4.5 . The second smallest number occurring in a sample of size 6, drawn from the urn without replacement.
Answer
\(p(x)=(x-1)\binom{12-x}{4} \Big/ \binom{12}{6} \quad\) for \(x=2, \ldots, 12; 0\) otherwise.
4.6 . The minimum number of balls that must be drawn, when sampling without replacement, to obtain a white ball.
4.7 . The minimum number of balls that must be drawn, when sampling with replacement, to obtain a white ball.
Answer
\(p(x)=\left(\frac{2}{3}\right)\left(\frac{1}{3}\right)^{x-1} \quad\) for \(x=1,2, \ldots ; 0\) otherwise.
4.8 . The minimum number of balls that must be drawn, when sampling without replacement, to obtain 2 white balls.
4.9 . The minimum number of balls that must be drawn, when sampling with replacement, to obtain 2 white balls.
Answer
\(p(x)=(x-1)\left(\frac{2}{3}\right)^{2}\left(\frac{1}{3}\right)^{x-2} \quad\) for \(x=2,3, \ldots ; 0\) otherwise.
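The answers to exercises 4.5 and 4.9 may also be checked computationally, the former exactly by enumerating all \(\binom{12}{6}=924\) possible samples and the latter by simulation; the sketch below is illustrative only.

```python
# A computational check of the answers to exercises 4.5 and 4.9. Exercise 4.5
# is checked exactly by enumerating all C(12, 6) = 924 samples; exercise 4.9
# is checked by simulation (8 of the 12 balls are white, so p = 2/3).
import random
from itertools import combinations
from math import comb

# Exercise 4.5: exhaustive enumeration of the second smallest number drawn.
total = comb(12, 6)
counts = {}
for sample in combinations(range(1, 13), 6):
    second = sorted(sample)[1]
    counts[second] = counts.get(second, 0) + 1
for x in range(2, 13):
    exact = (x - 1) * comb(12 - x, 4) / total  # comb is 0 once 12 - x < 4
    assert abs(counts.get(x, 0) / total - exact) < 1e-12

# Exercise 4.9: simulate drawing with replacement until 2 white balls appear.
random.seed(1)
p, reps = 2 / 3, 200_000
freq = {}
for _ in range(reps):
    draws = whites = 0
    while whites < 2:
        draws += 1
        whites += random.random() < p
    freq[draws] = freq.get(draws, 0) + 1
for x in range(2, 7):
    print(x, round(freq.get(x, 0) / reps, 4),
          round((x - 1) * p**2 * (1 - p)**(x - 2), 4))
```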