In this section we indicate briefly how one may derive the Poisson probability law, and various related probability laws, by means of differential equations. The process to be examined is treated in the literature of stochastic processes under the name “birth and death”.
Consider a population, such as the molecules present in a certain sub-volume of gas, the particles emitted by a radioactive source, biological organisms of a certain kind present in a certain environment, persons waiting in a line (queue) for service, and so on. Let \(X_{t}\) be the size of the population at a given time \(t\) . The probability law of \(X_{t}\) is specified by its probability mass function,
\[p(n ; t)=P\left[X_{t}=n\right] \quad n=0,1,2, \cdots. \tag{5.1}\]
A differential equation for the probability mass function of \(X_{t}\) may be found under assumptions similar in spirit to, but somewhat more general than, those made in deriving (3.1). In reading the following discussion the reader should attempt to formulate explicitly for himself the assumptions that are being made. A rigorous treatment of these matters is given by W. Feller, An Introduction to Probability Theory and Its Applications, Wiley, 1957, pp. 397–411.
Let \(r_{0}(h), r_{1}(h)\) , and \(r_{2}(h)\) be functions defined for \(h>0\) with the property that
\[\lim _{h \rightarrow 0} \frac{r_{0}(h)}{h}=\lim _{h \rightarrow 0} \frac{r_{1}(h)}{h}=\lim _{h \rightarrow 0} \frac{r_{2}(h)}{h}=0.\]
Assume that the probability is \(r_{2}(h)\) that in the time from \(t\) to \(t+h\) the population size will change by two or more. For \(n \geq 1\) the event that \(X_{t+h}=n\) (\(n\) members in the population at time \(t+h\)) can then essentially happen in any one of three mutually exclusive ways: (i) the population size at time \(t\) is \(n\) and undergoes no change in the time from \(t\) to \(t+h\); (ii) the population size at time \(t\) is \(n-1\) and increases by one in the time from \(t\) to \(t+h\); (iii) the population size at time \(t\) is \(n+1\) and decreases by one in the time from \(t\) to \(t+h\). For \(n=0\), the event that \(X_{t+h}=0\) can happen only in ways (i) and (iii). Now let us introduce quantities \(\lambda_{n}\) and \(\mu_{n}\), defined as follows: for any time \(t\) and any positive value of \(h\), \(\lambda_{n} h+r_{1}(h)\) is the conditional probability that the population size will increase by one in the time from \(t\) to \(t+h\), given that the population had size \(n\) at time \(t\), whereas \(\mu_{n} h+r_{0}(h)\) is the conditional probability that the population size will decrease by one in the time from \(t\) to \(t+h\), given that the population had size \(n\) at time \(t\). In symbols, \(\lambda_{n}\) and \(\mu_{n}\) are such that, for any time \(t\) and small \(h>0\),
\[\begin{array}{rlrl} \lambda_{n} h & \doteq P\left[X_{t+h}-X_{t}=1 \mid X_{t}=n\right], & & n \geq 0 \\ \mu_{n} h & \doteq P\left[X_{t+h}-X_{t}=-1 \mid X_{t}=n\right], & & n \geq 1; \tag{5.2} \end{array}\]
the approximation in (5.2) is such that the difference between the two sides of each equation tends to 0 faster than \(h\), as \(h\) tends to 0. In writing the next equations we omit terms that tend to 0 faster than \(h\), since, after division by \(h\), such terms contribute nothing in the limit as \(h\) tends to 0 and therefore do not affect the differential equations (5.10) and (5.11). The reader may wish to verify this statement for himself.
The event (i) then has probability,
\[p(n ; t)\left(1-\lambda_{n} h-\mu_{n} h\right); \tag{5.3}\]
the event (ii) has probability
\[p(n-1 ; t) \lambda_{n-1} h; \tag{5.4}\]
the event (iii) has probability
\[p(n+1 ; t) \mu_{n+1} h. \tag{5.5}\]
Consequently, since the three ways are mutually exclusive and, up to terms that tend to 0 faster than \(h\), exhaust the ways in which the event \(X_{t+h}=n\) can occur, one obtains for \(n \geq 1\)
\begin{align}
p(n ; t+h)=p(n ; t)\left(1-\lambda_{n} h-\mu_{n} h\right) & +p(n-1 ; t) \lambda_{n-1} h \tag{5.6} \\
& +p(n+1 ; t) \mu_{n+1} h.
\end{align}
For \(n=0\) one obtains
\[p(0 ; t+h)=p(0 ; t)\left(1-\lambda_{0} h\right)+p(1 ; t) \mu_{1} h. \tag{5.7}\]
It may be noted that if there is a maximum possible value \(N\) for the population size then (5.6) holds only for \(1 \leq n \leq N-1\) , whereas for \(n=N\) one obtains
\[p(N ; t+h)=p(N ; t)\left(1-\mu_{N} h\right)+p(N-1 ; t) \lambda_{N-1} h. \tag{5.8}\]
Rearranging (5.6), one obtains
\begin{align}
\frac{p(n ; t+h)-p(n ; t)}{h}= & -\left(\lambda_{n}+\mu_{n}\right) p(n ; t) \tag{5.9} \\
& +\lambda_{n-1} p(n-1 ; t)+\mu_{n+1} p(n+1 ; t).
\end{align}
Letting \(h\) tend to 0, one finally obtains for \(n \geq 1\)
\begin{align} & \frac{\partial}{\partial t} p(n ; t)=-\left(\lambda_{n}+\mu_{n}\right) p(n ; t) \tag{5.10} \\ & \quad+\lambda_{n-1} p(n-1 ; t)+\mu_{n+1} p(n+1 ; t). \end{align}
Similarly, for \(n=0\) one obtains
\[\frac{\partial}{\partial t} p(0 ; t)=-\lambda_{0} p(0 ; t)+\mu_{1} p(1 ; t). \tag{5.11}\]
The question of the existence and uniqueness of solutions of these equations is nontrivial and is not discussed here.
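Although no general solution is attempted here, the equations can be explored numerically when the state space is truncated at a maximum size \(N\), as in (5.8). The following Python sketch (not part of the original text) advances the probability vector by Euler steps of length \(h\); apart from the truncation, each step is precisely the set of relations (5.6) and (5.7) with the terms of smaller order than \(h\) discarded. The rate sequences \(\lambda_{n}\) and \(\mu_{n}\) are supplied by the caller.

```python
import numpy as np

def integrate_birth_death(lam, mu, p0, t, h=1e-4):
    """Integrate the birth-and-death equations (5.10)-(5.11) by Euler steps
    on a state space truncated at N = len(p0) - 1, as in (5.8).

    lam[n] -- birth rate lambda_n, mu[n] -- death rate mu_n (mu[0] unused),
    p0[n]  -- initial probability p(n; 0).
    """
    p = np.array(p0, dtype=float)
    N = len(p) - 1
    for _ in range(round(t / h)):
        new = np.empty_like(p)
        for n in range(N + 1):
            # loss: the population leaves state n by a birth or a death
            loss = (lam[n] if n < N else 0.0) + (mu[n] if n > 0 else 0.0)
            # gain: way (ii), a birth from state n-1; way (iii), a death from n+1
            gain = ((lam[n - 1] * p[n - 1] if n >= 1 else 0.0)
                    + (mu[n + 1] * p[n + 1] if n <= N - 1 else 0.0))
            new[n] = p[n] + h * (gain - loss * p[n])
        p = new
    return p

# Pure-birth example (all lambda_n = 2, all mu_n = 0), population of size 0 at t = 0.
N = 40
print(integrate_birth_death([2.0] * (N + 1), [0.0] * (N + 1),
                            [1.0] + [0.0] * N, t=1.0)[:5])
```

The final lines treat the pure-birth case considered next (all \(\lambda_{n}=\lambda\), all \(\mu_{n}=0\)); their output may be compared with the Poisson probabilities (5.17) derived below.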
We solve these equations only in the case that
\begin{align} & \lambda_{0}=\lambda_{1}=\lambda_{2}=\cdots=\lambda_{n}=\cdots=\lambda \tag{5.12} \\ & \mu_{1}=\mu_{2}=\mu_{3}=\cdots=\mu_{n}=\cdots=0, \end{align}
which corresponds to the assumptions made before (3.1). Then (5.11) becomes
\[\frac{\partial}{\partial t} p(0 ; t)=-\lambda p(0 ; t), \tag{5.13}\]
which has solution (under the assumption that the population size at time 0 is 0, so that \(p(0 ; 0)=1\))
\[p(0 ; t)=e^{-\lambda t}. \tag{5.14}\]
Next, (5.10) for the case \(n=1\) becomes
\[\frac{\partial}{\partial t} p(1 ; t)=-\lambda p(1 ; t)+\lambda p(0 ; t), \tag{5.15}\]
which has solution (under the assumption \(p(1 ; 0)=0\) )
\begin{align} p(1 ; t) & =\lambda e^{-\lambda t} \int_{0}^{t} e^{\lambda t^{\prime}} p\left(0 ; t^{\prime}\right) d t^{\prime} =\lambda t e^{-\lambda t}. \tag{5.16} \end{align}
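For the reader who wishes to see how (5.16) arises, multiplying (5.15) by the integrating factor \(e^{\lambda t}\) gives
\[\frac{\partial}{\partial t}\left[e^{\lambda t} p(1 ; t)\right]=e^{\lambda t}\left[\frac{\partial}{\partial t} p(1 ; t)+\lambda p(1 ; t)\right]=\lambda e^{\lambda t} p(0 ; t),\]
and integrating from 0 to \(t\), using \(p(1 ; 0)=0\), yields the first equality in (5.16); the second equality follows upon inserting (5.14), since the integrand then reduces to 1 and the integral equals \(t\). The same device, applied to (5.10) with all \(\lambda_{n}=\lambda\) and all \(\mu_{n}=0\), gives for every \(n \geq 1\)
\[p(n ; t)=\lambda e^{-\lambda t} \int_{0}^{t} e^{\lambda t^{\prime}} p\left(n-1 ; t^{\prime}\right) d t^{\prime},\]
which is the relation used in the induction below.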
Proceeding inductively, one obtains, for every \(n \geq 0\) (assuming \(p(n ; 0)=0\) for \(n \geq 1\)),
\[p(n ; t)=\frac{(\lambda t)^{n}}{n !} e^{-\lambda t}, \tag{5.17}\]
so that the size \(X_{t}\) of the population at time \(t\) obeys a Poisson probability law with mean \(\lambda t\) .
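As an independent check (not in the original text), one may verify with a computer algebra system that (5.17) satisfies (5.11) and satisfies (5.10) when all \(\lambda_{n}=\lambda\) and all \(\mu_{n}=0\). A minimal sketch using the sympy library:

```python
import sympy as sp

t, lam = sp.symbols("t lambda", positive=True)

def p(n):
    # candidate solution (5.17)
    return (lam * t) ** n / sp.factorial(n) * sp.exp(-lam * t)

# (5.11): d/dt p(0;t) = -lambda p(0;t)
assert sp.simplify(sp.diff(p(0), t) + lam * p(0)) == 0

# (5.10) with lambda_n = lambda and mu_n = 0:
# d/dt p(n;t) = -lambda p(n;t) + lambda p(n-1;t), checked for n = 1, ..., 7
for n in range(1, 8):
    residual = sp.diff(p(n), t) + lam * p(n) - lam * p(n - 1)
    assert sp.simplify(residual) == 0

print("(5.17) satisfies (5.10) and (5.11) for n = 0, 1, ..., 7")
```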
Theoretical Exercises
5.1. The Yule process. Consider a population whose members can (by splitting or otherwise) give birth to new members but cannot die. Assume that the probability is approximately equal to \(\lambda h\) that in a short time interval of length \(h\) a given member will create a new member. More precisely, in the model of section 5, assume that
\[\lambda_{n}=n \lambda, \quad \mu_{n}=0.\]
If at time 0 the population size is \(k\), show that the probability that the population size at time \(t\) is equal to \(n\) is given by
\[p(n ; t)=\binom{n-1}{n-k} e^{-k \lambda t}\left(1-e^{-\lambda t}\right)^{n-k}, \quad n \geq k. \tag{5.18}\]
Show that the probability law defined by (5.18) has mean \(m\) and variance \(\sigma^{2}\) given by
\[m=k e^{\lambda t}, \quad \sigma^{2}=k e^{\lambda t}\left(e^{\lambda t}-1\right). \tag{5.19}\]
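A reader who wishes to check (5.19) empirically may simulate the process. The following Python sketch (the parameter values are arbitrary) uses the standard interpretation of the assumption \(\lambda_{n}=n \lambda\): while the population has \(n\) members, the waiting time to the next birth is exponentially distributed with parameter \(n \lambda\).

```python
import math
import random

random.seed(0)

def yule_size(k, lam, t):
    """One realization of the Yule process (lambda_n = n*lambda, mu_n = 0):
    the population size at time t, starting from k members at time 0."""
    n, clock = k, 0.0
    while True:
        # with n members present, the waiting time to the next birth is
        # taken to be exponentially distributed with rate n * lambda
        clock += random.expovariate(n * lam)
        if clock > t:
            return n
        n += 1

k, lam, t, trials = 3, 0.5, 2.0, 100_000
sizes = [yule_size(k, lam, t) for _ in range(trials)]
m_hat = sum(sizes) / trials
v_hat = sum((x - m_hat) ** 2 for x in sizes) / trials
print(m_hat, k * math.exp(lam * t))                            # mean m of (5.19)
print(v_hat, k * math.exp(lam * t) * (math.exp(lam * t) - 1))  # variance of (5.19)
```

The sample mean and sample variance should be close to the values given by (5.19), namely \(3 e \doteq 8.15\) and \(3 e(e-1) \doteq 14.01\) for these parameter values.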