A question of great importance in science and engineering is the following: under what conditions can an observed value of a random variable \(X\) be identified with its mean \(E[X]\) ? We have seen in section 5 that if \(X\) is the arithmetic mean of a very large number of independent identically distributed random variables then for any preassigned distance \(\epsilon\) an observed value of \(X\) will, with high probability, be within \(\epsilon\) of \(E[X]\) . In this section we discuss some conditions under which an observed value of a random variable may be identified with its mean.

If \(X\) has finite mean \(E[X]\) and variance \(\sigma^{2}[X]\) , then the condition that an observed value of \(X\) is, with high probability, within a preassigned distance \(\epsilon\) from its mean may be obtained from Chebyshev’s inequality: for any \(\epsilon>0\) \[P[|X-E[X]| \leq \epsilon] \geq 1-\frac{\sigma^{2}[X]}{\epsilon^{2}}. \tag{6.1}\] From (6.1) one obtains these conclusions: \begin{align} P[|X-E[X]| \leq \epsilon] & \geq 95 \% & \text { if } \epsilon \geq 4.5 \sigma[X], \tag{6.2} \\ & \geq 99 \% & \text { if } \epsilon \geq 10 \sigma[X]. \end{align} 
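The thresholds in (6.2) follow from solving \(1 - 1/k^2 \geq p\) for \(\epsilon = k\,\sigma[X]\), and the bound (6.1) can be checked empirically. The following Python sketch is illustrative only; the exponential distribution is an arbitrary test case.

```python
import math
import random

def chebyshev_lower_bound(eps, sigma):
    """Lower bound (6.1) on P[|X - E[X]| <= eps]."""
    return max(0.0, 1.0 - (sigma / eps) ** 2)

# The thresholds in (6.2): solve 1 - 1/k^2 >= p for eps = k * sigma.
for p in (0.95, 0.99):
    k = math.sqrt(1.0 / (1.0 - p))
    print(f"coverage >= {p:.0%} whenever eps >= {k:.2f} * sigma")

# Empirical check on an exponential variable (mean 1, sigma 1):
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]
eps = 4.5  # i.e. 4.5 * sigma[X]
frac = sum(abs(x - 1.0) <= eps for x in samples) / len(samples)
assert frac >= chebyshev_lower_bound(eps, 1.0)
```

Note that the bound is conservative: for the exponential case the true coverage at \(\epsilon = 4.5\,\sigma\) is far above 95%.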

If \(X\) may be assumed to be approximately normally distributed, then \[P[|X-E[X]| \leq \epsilon]=\Phi(\epsilon / \sigma[X])-\Phi(-\epsilon / \sigma[X]). \tag{6.3}\] From (6.3) one obtains these conclusions: \begin{align} P[|X-E[X]| \leq \epsilon] & \geq 95 \% & & \text { if } \epsilon \geq 1.96 \sigma[X], \tag{6.4} \\ & \geq 99 \% & & \text { if } \epsilon \geq 2.58 \sigma[X]. \end{align} As a measure of how close the observed value of \(X\) will be to its mean \(E[X]\) , one often uses not the absolute deviation \(|X-E[X]|\) but the relative deviation \[\frac{|X-E[X]|}{|E[X]|}=\left|1-\frac{X}{E[X]}\right|, \tag{6.5}\] assuming that \(E[X] \neq 0\) .
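The coverage probability in (6.3) can be evaluated directly, since \(\Phi\) is expressible through the error function. A Python sketch, illustrative only:

```python
import math

def Phi(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_coverage(eps_over_sigma):
    """P[|X - E[X]| <= eps] for approximately normal X, from (6.3)."""
    return Phi(eps_over_sigma) - Phi(-eps_over_sigma)

print(round(normal_coverage(1.96), 4))  # about 0.95, as in (6.4)
print(round(normal_coverage(2.58), 4))  # about 0.99
```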

Chebyshev’s inequality may be reformulated in terms of the relative deviation: for any \(\delta>0\) \[P\left[\left|\frac{X-E[X]}{E[X]}\right| \leq \delta\right] \geq 1-\frac{1}{\delta^{2}} \frac{\sigma^{2}[X]}{E^{2}[X]}. \tag{6.6}\] From (6.6) one obtains these conclusions: \begin{align} P\left[\left|\frac{X-E[X]}{E[X]}\right| \leq \delta\right] & \geq 95 \% \quad \text { if } \delta \geq 4.5 \frac{\sigma[X]}{|E[X]|}, \tag{6.7} \\ & \geq 99 \% \quad \text { if } \delta \geq 10 \frac{\sigma[X]}{|E[X]|}. \end{align} 

Similarly, if \(X\) is approximately normally distributed,

\begin{align} P\left[\left|\frac{X-E[X]}{E[X]}\right| \leq \delta\right] & \geq 95 \% \quad \text { if } \delta \geq 1.96 \frac{\sigma[X]}{|E[X]|}, \tag{6.8} \\ & \geq 99 \% \quad \text { if } \delta \geq 2.58 \frac{\sigma[X]}{|E[X]|}. \end{align} 

From the foregoing inequalities we obtain this basic conclusion for a random variable \(X\) with nonzero mean and finite variance.

In order that the percentage error of \(X\) as an estimate of \(E[X]\) may with high probability be small, it is sufficient that the ratio \[\frac{|E[X]|}{\sigma[X]} \tag{6.9}\] be large. The quantity in (6.9) is called the measurement signal-to-noise ratio\(^{1}\) of the random variable \(X\) .

How large must the measurement signal-to-noise ratio of a random variable \(X\) be in order that an observed value of \(X\) be a good estimate of its mean? Various answers to this question can be obtained from (6.7) and (6.8).

For example, if it is desired that \[P\left[\left|\frac{X-E[X]}{E[X]}\right| \leq 10 \%\right] \geq 95 \%, \tag{6.10}\] then the measurement signal-to-noise ratio must satisfy approximately \[\begin{array}{ll} \dfrac{|E[X]|}{\sigma[X]} \geq 45 & \text { if Chebyshev's inequality applies, } \\ \dfrac{|E[X]|}{\sigma[X]} \geq 20 & \text { if the normal approximation applies. } \tag{6.11} \end{array}\] 
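The numbers in (6.11) are simply the constants \(4.5\) and \(1.96\) from (6.7) and (6.8) divided by \(\delta = 0.10\); a short Python check:

```python
# Required signal-to-noise ratio |E[X]|/sigma[X] so that the relative
# error is at most delta with probability at least 95%, per (6.11).
delta = 0.10
chebyshev_k = 4.5   # constant from (6.7)
normal_k = 1.96     # constant from (6.8)
print(round(chebyshev_k / delta, 1))  # 45.0 -- if Chebyshev's inequality applies
print(round(normal_k / delta, 1))     # 19.6, roughly 20 -- normal approximation
```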

The measurement signal-to-noise ratio of various random variables is given in Table 6A. One sees that for most of the random variables given the measurement signal-to-noise ratio is proportional to the square root of some parameter. For example, suppose the number of particles emitted by a radioactive source during a certain time interval is being counted. The number of particles emitted obeys a Poisson probability law with some parameter \(\lambda\) whose value is unknown. If the true value of \(\lambda\) is very large, then the observed number \(X\) of emitted particles is a good estimate of \(\lambda\) , since the measurement signal-to-noise ratio of \(X\) is \(\sqrt{\lambda}\) .

\[
\begin{array}{llll}
\hline
\text{Probability law of } X & E[X] & \sigma^2[X] & \left(\dfrac{E[X]}{\sigma[X]}\right)^2 \\
\hline
\text{Poisson, with parameter } \lambda > 0 & \lambda & \lambda & \lambda \\
\text{Binomial, with parameters } n \text{ and } p & np & np(1 - p) & \dfrac{np}{1 - p} \\
\text{Geometric, with parameter } p \ (q = 1 - p) & \dfrac{1}{p} & \dfrac{q}{p^2} & \dfrac{1}{q} \\
\text{Uniform over the interval } a \text{ to } b & \dfrac{a + b}{2} & \dfrac{1}{12}(b - a)^2 & 3\left(\dfrac{b + a}{b - a}\right)^2 \\
\text{Normal, with parameters } m \text{ and } \sigma & m & \sigma^2 & \left(\dfrac{m}{\sigma}\right)^2 \\
\text{Exponential, with parameter } \lambda & \dfrac{1}{\lambda} & \dfrac{1}{\lambda^2} & 1 \\
\chi^2 \text{, with } n \text{ degrees of freedom} & n & 2n & \dfrac{n}{2} \\
F \text{, with } n_1, n_2 \text{ degrees of freedom } (n_2 > 4) & \dfrac{n_2}{n_2 - 2} & \dfrac{2n_2^2(n_1 + n_2 - 2)}{n_1(n_2 - 2)^2(n_2 - 4)} & \dfrac{n_1(n_2 - 4)}{2(n_1 + n_2 - 2)} \\
\hline
\end{array}
\]
TABLE 6A. Measurement Signal-to-Noise Ratio of Random Variables Obeying Various Probability Laws
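Two rows of Table 6A can be spot-checked by simulation. The following Python sketch (illustrative only; the parameter values are arbitrary) estimates \(\left(E[X]/\sigma[X]\right)^2\) from large samples:

```python
import random
import statistics

def snr_squared(samples):
    """Empirical estimate of (E[X]/sigma[X])^2 from a sample."""
    m = statistics.fmean(samples)
    return m * m / statistics.pvariance(samples)

random.seed(1)
n = 200_000

# Exponential row of Table 6A: (E/sigma)^2 = 1, whatever the parameter.
expo = [random.expovariate(2.5) for _ in range(n)]
print(round(snr_squared(expo), 2))  # close to 1

# Uniform row, interval [1, 3]: 3*((b + a)/(b - a))^2 = 3 * (4/2)^2 = 12.
unif = [random.uniform(1.0, 3.0) for _ in range(n)]
print(round(snr_squared(unif), 1))  # close to 12
```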

It is shown in Chapter 10 that many of the random variables in Table 6A are approximately normally distributed in cases in which their measurement signal-to-noise ratio is very large.

Example 6A. The density of an ideal gas . An ideal gas can be regarded as a collection of \(n\) molecules distributed randomly in a volume \(V\) . The density of the gas in a subvolume \(v\) , contained in \(V\) , is a random variable \(d\) given by \(d=N m / v\) , in which \(m\) is the mass of one gas molecule and \(N\) is the number of molecules in the volume \(v\) . Since it is assumed that each of the \(n\) molecules has an independent probability \(v / V\) of being in the subvolume \(v\) , the number \(N\) of molecules in \(v\) obeys a binomial probability law with mean \(E[N]=n v / V\) and variance \(\sigma^{2}[N]=n p q\) , in which we have let \(p=v / V\) and \(q=1-p\) . The density then has mean \(E[d]=n m / V\) . In speaking of the density of gas in the volume \(v\) , the physicist usually has in mind the mean density. The question naturally arises: under what circumstances is the relative deviation \((d-E[d]) / E[d]\) of the true density \(d\) from the mean density \(E[d]\) within a preassigned percentage error \(\delta\) ? More specifically, what values must \(n, m\) , \(v\) , and \(V\) have in order that

\[P\left[\left|\frac{d-E[d]}{E[d]}\right| \leq \delta\right] \geq 1-\eta, \tag{6.12}\] 

in which \(\delta\) and \(\eta\) are preassigned positive quantities? By Chebyshev’s inequality,

\[P\left[\left|\frac{d-E[d]}{E[d]}\right| \leq \delta\right] \geq 1-\frac{\sigma^{2}[d]}{\delta^{2} E^{2}[d]}=1-\frac{q}{\delta^{2} n p}. \tag{6.13}\] 

Consequently, if the quantities \(n, m, v\) , and \(V\) are such that

\[\frac{1-(v / V)}{n(v / V)} \leq \delta^{2} \eta, \tag{6.14}\] 

then (6.12) holds. Because of the enormous size of \(n\) (which is of the order \(10^{20}\) per \(\mathrm{cm}^{3}\) ), one would expect (6.14) to be satisfied for \(\eta=\delta=10^{-5}\) , say, as long as \(v / V\) is not too small. In this case it makes sense to speak of the density of gas in \(v\) , even though the number of molecules in \(v\) is not fixed but fluctuates. However, if \(v / V\) is very small, the fluctuations become quite pronounced, and the ordinary notion of density, which identifies density with mean density, loses its meaning. The “density fluctuations” in small volumes can actually be detected experimentally, inasmuch as they cause the scattering of light of sufficiently short wavelength.
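Condition (6.14) is easy to evaluate numerically. The following Python sketch (the values of \(n\) and \(v/V\) are illustrative) contrasts a macroscopic subvolume with a very small one:

```python
# Condition (6.14): (1 - v/V) / (n * v/V) <= delta^2 * eta
# guarantees (6.12), by the Chebyshev bound (6.13).
def density_is_sharp(n, v_over_V, delta, eta):
    return (1.0 - v_over_V) / (n * v_over_V) <= delta**2 * eta

# n ~ 1e20 molecules, subvolume equal to 1% of the container:
print(density_is_sharp(1e20, 0.01, 1e-5, 1e-5))  # True: fluctuations negligible
# A very small subvolume (v/V = 1e-9) fails the same tolerance:
print(density_is_sharp(1e20, 1e-9, 1e-5, 1e-5))  # False
```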

Example 6B. The law of \(\sqrt{n}\) . The physicist Erwin Schrödinger has pointed out ( What is Life? , Cambridge University Press, 1945, p. 16): “…the degree of inaccuracy to be expected in any physical law, the so-called \(\sqrt{n}\) law. The laws of physics and physical chemistry are inaccurate within a probable relative error of the order of \(1 / \sqrt{n}\) , where \(n\) is the number of molecules that cooperate to bring about that law.” From the law of \(\sqrt{n}\) Schrödinger draws the conclusion that, in order for the laws of physics and chemistry to be sufficient to explain the laws governing the behavior of living organisms, it is necessary that the biologically relevant processes of such an organism involve the cooperation of a very large number of atoms, for only in this case do the laws of physics become exact laws. Since one can show that there are “incredibly small groups of atoms, much too small to display exact statistical laws, which play a dominating role in the very orderly and lawful events within a living organism”, Schrödinger conjectures that it may not be possible to interpret life by the ordinary laws of physics, based on the “statistical mechanism which produces order from disorder”. We state here a mathematical formulation of the law of \(\sqrt{n}\) . If \(X_{1}, X_{2}, \ldots, X_{n}\) are independent random variables identically distributed as a random variable \(X\) , then the sum \(S_{n}=X_{1}+X_{2}+\cdots+X_{n}\) and the sample mean \(M_{n}=S_{n} / n\) have measurement signal-to-noise ratios given by

\[\frac{E\left[S_{n}\right]}{\sigma\left[S_{n}\right]}=\frac{E\left[M_{n}\right]}{\sigma\left[M_{n}\right]}=\sqrt{n} \frac{E[X]}{\sigma[X]}.\] 

In words, the sum or average of \(n\) repeated independent measurements of a random variable \(X\) has a measurement signal-to-noise ratio of the order of \(\sqrt{n}\) .
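The law of \(\sqrt{n}\) can be observed directly by simulation. In the Python sketch below (illustrative only; the exponential distribution, for which \(E[X]/\sigma[X] = 1\), is an arbitrary choice), the empirical signal-to-noise ratio of the sample mean \(M_n\) tracks \(\sqrt{n}\):

```python
import random
import statistics

def empirical_snr(values):
    """Empirical E[X]/sigma[X] estimated from repeated observations."""
    return statistics.fmean(values) / statistics.pstdev(values)

random.seed(2)
# X ~ exponential(1), so E[X]/sigma[X] = 1 and the SNR of M_n should be sqrt(n).
for n in (25, 100, 400):
    means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
             for _ in range(2_000)]
    print(n, round(empirical_snr(means), 1))  # near sqrt(n): 5, 10, 20
```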

Example 6C. Can the energy of an ideal gas be both constant and a \(\chi^{2}\) distributed random variable? In example 9H of Chapter 7 it is shown that if the state of an ideal gas is a random phenomenon whose probability law is given by Gibbs’s canonical distribution then the energy \(E\) of the gas is a random variable possessing a \(\chi^{2}\) distribution with \(3 N\) degrees of freedom, in which \(N\) is the number of particles comprising the gas. Does this mean that if a gas has constant energy its state as a point in the space of all possible velocities cannot be regarded as obeying Gibbs’s canonical distribution? The answer to this question is no. From a practical point of view there is no contradiction in regarding the energy \(E\) of the gas as being both a constant and a random variable with a \(\chi^{2}\) distribution if the number of degrees of freedom is very large, for then the measurement signal-to-noise ratio of \(E\) (which, from Table 6A, is equal to \((3N/2)^{\frac{1}{2}}\) ) is also very large.
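Numerically, the reciprocal of this signal-to-noise ratio measures the relative fluctuation of the energy \(E\). A brief Python sketch (illustrative only):

```python
import math

# Relative fluctuation of the energy of a chi-square variable with 3N
# degrees of freedom: the reciprocal of the SNR sqrt(3N/2) from Table 6A.
def relative_fluctuation(N):
    return 1.0 / math.sqrt(1.5 * N)

for N in (10, 10**20):
    print(N, relative_fluctuation(N))
# For N of the order 1e20, the energy is constant to about one part in 1e10,
# so treating E as both constant and chi-square distributed is consistent.
```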

The terminology “signal-to-noise ratio” originated in communications theory. The mean \(E[X]\) of a random variable \(X\) is regarded as a signal that one is attempting to receive (say, at a radio receiver). What is actually received, however, is \(X\) . The difference between the desired value \(E[X]\) and the received value \(X\) is called noise . The less noise present, the more accurately one is able to receive the signal. As a measure of signal strength relative to noise strength, one takes the signal-to-noise ratio defined by (6.9). The higher the signal-to-noise ratio, the more accurate the observed value \(X\) is as an estimate of the desired value \(E[X]\) .

Any time a scientist makes a measurement he is attempting to obtain a signal in the presence of noise or, equivalently, to estimate the mean of a random variable. The skill of the experimental scientist lies in being able to conduct experiments that have a high measurement signal-to-noise ratio. However, there are experimental situations in which this may not be possible. For example, there is an inherent limit on how small one can make the variance of measurements taken with electronic devices. This limit arises from the noise or spontaneous current fluctuations present in such devices (see example 3D of Chapter 6). To measure weak signals in the presence of noise (that is, to measure the mean of a random variable with a small measurement signal-to-noise ratio) one should have a good knowledge of the modern theories of statistical inference.

On the one hand, the scientist and engineer should know statistics in order to interpret best the statistical significance of the data he has obtained. On the other hand, a knowledge of statistics will help the scientist or engineer to solve the basic problem confronting him in taking measurements: given a parameter \(\theta\) , which he wishes to measure, to find random variables \(X_{1}, X_{2}, \ldots, X_{n}\) , whose observed values can be used to form estimates of \(\theta\) that are best according to some criteria.

Measurement signal-to-noise ratios play a basic role in the evaluation of modern electronic apparatus. The reader interested in such questions may consult J. J. Freeman, Principles of Noise , Wiley, New York, 1958, Chapters 7 and 9.

Exercises

6.1. A random variable \(X\) has an unknown mean and known variance 4. How large a random sample should one take if the probability is to be at least 0.95 that the sample mean will not differ from the true mean \(E[X]\) by (i) more than 0.1, (ii) more than \(10 \%\) of the standard deviation of \(X\) , (iii) more than \(10 \%\) of the true mean of \(X\) , if the true mean of \(X\) is known to be greater than 10?

 

Answer

(i) \(n \geq 1537\) ; (ii) \(n \geq 385\) ; (iii) \(n \geq 16\) .
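These answers can be checked under the normal approximation, which requires \(1.96\,\sigma[X]/\sqrt{n} \leq \epsilon\), i.e. \(n \geq (1.96\,\sigma[X]/\epsilon)^2\) with \(\sigma[X] = 2\); a Python sketch, illustrative only:

```python
import math

# Smallest n with 1.96 * sigma / sqrt(n) <= eps, i.e. n >= (1.96*sigma/eps)^2.
def min_n(eps, sigma=2.0, k=1.96):
    return math.ceil((k * sigma / eps) ** 2)

print(min_n(0.1))  # (i)   eps = 0.1                       -> 1537
print(min_n(0.2))  # (ii)  eps = 10% of sigma[X] = 0.2     -> 385
print(min_n(1.0))  # (iii) eps = 10% of E[X] > 10, so >= 1 -> 16
```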

 

6.2. Let \(X_{1}, X_{2}, \ldots, X_{n}\) be independent normally distributed random variables with known mean 0 and unknown common variance \(\sigma^{2}\) . Define

\[S_{n}=\frac{1}{n}\left(X_{1}^{2}+X_{2}^{2}+\cdots+X_{n}^{2}\right).\] 

Since \(E\left[S_{n}\right]=\sigma^{2}, S_{n}\) might be used as an estimate of \(\sigma^{2}\) . How large should \(n\) be in order to have a measurement signal-to-noise ratio of \(S_{n}\) greater than 20? If the measurement signal-to-noise ratio of \(S_{n}\) is greater than 20, how good is \(S_{n}\) as an estimate of \(\sigma^{2}\) ?

6.3. Consider a gas composed of molecules (with mass of the order of \(10^{-24}\) grams and at room temperature) whose velocities obey the Maxwell-Boltzmann law (see exercise 1.15). Show that one may assume that all the molecules move with the same velocity, which may be taken as either the mean velocity, the root mean square velocity, or the most probable velocity.

 

Answer

\(E[v] / \sigma[v] \doteq 10^{5}\)

 


  1. The measurement signal-to-noise ratio of a random variable is the reciprocal of the coefficient of variation of the random variable. (For a definition of the latter, see M. G. Kendall and A. Stuart, The Advanced Theory of Statistics , Griffin, London, 1958, p. 47.)