These notes are based on Chapter 2 of Robert G. Gallager (2013). Stochastic Processes: Theory for Applications. Cambridge University Press.

The Poisson process is, in many ways, a continuous-time analogue of the Bernoulli process, a sequence of IID Bernoulli trials, which we use to motivate Binomial, Geometric, and Negative binomial.

Here is a summary of correspondences between discrete-time and continuous-time analogues in different modelling contexts.

Context Discrete time Continuous time
Number of arrivals Binomial Poisson
Time to an arrival Geometric Exponential
Time to multiple arrivals Negative binomial Gamma
Arrival process Bernoulli process Poisson process

Let’s first define a couple of key concepts and review the Bernoulli process to motivate the Poisson process.

Arrival process

An arrival process is a sequence of increasing random variables \(0 < S_1 < S_2 < \dots\), where \(S_1, S_2,\dots\) are called arrival epochs and represent successive times at which some random phenomenon occurs.

In these notes, epoch is used as a synonym for time to avoid possible confusion.

Any arrival process \(\{S_n : n\geq 1\}\) can also be specified by either of the following alternative stochastic processes. The first one is the sequence of interarrival times \(\{X_i : i\geq 1\}\). Each \(X_i\) is a positive random variable defined in terms of the arrival epochs such that \(X_1=S_1\) and \(X_i=S_i-S_{i−1}\) for \(i > 1\). Conversely, each arrival epoch \(S_n\) for \(n \geq 1\) is specified by \(\{X_1,\dots,X_n\}\) as \[S_n = \sum_{i=1}^n X_i.\]

When the interarrival sequence is IID, an arrival process is often called renewal process. Hence, the Bernoulli process and Poisson process are special types of renewal process.

The second alternative is the counting process \(\{N(t) : t>0\}\) where, for each \(t>0\), the random variable \(N(t)\) is the total number of arrivals over the time interval \((0,t]\).

We assume \(N(0)=0\) with probability 1 and focus on arrivals at strictly positive epochs.

The relations among three stochastic processes are easily seen in the following figure.

For now, there are two properties to note. The first one is \(N(\tau) \geq N(t)\) for \(\tau>t>0\).

The second one, which is less obvious, is that the following events are equivalent: \[\{S_n \leq t\} = \{N(t) \geq n\}\] for \(n \geq 1\) and \(t > 0\). To see this, notice, on the one hand \(S_n \leq t\) means that the \(n\)th arrival occurs by the epoch \(t\). On the other hand, \(N(t) \geq n\) means that at least \(n\) arrivals occurs by the epoch \(t\).

In short, an arrival process can be specified by one of the following:

Discrete-time arrival process

Let \(\{B_t : t\in\mathbb{Z}_{>0}\}\) denote the Bernoulli process, where each of IID trials is \(B_t\sim\text{Bernoulli}(p)\). We may see the Bernoulli process as an arrival process in the following way:

The Bernoulli process is a discrete-time arrival process \(S_1,S_2,\dots\), where arrival epoch \(S_n\) is a time slot for the \(n\)th arrival. An implication is that \(S_n\geq n\) and \(S_n=n\) only if arrival happens in every slot up to \(n\).

The interarrival times \(X_1,X_2,\dots\) follows IID \(\text{geom}(p)\). (N.B. \(\text{geom}(p)\) here is the number of trials instead of failure.) Consequently, in the counting process \(\{N(t) : t\in\mathbb{Z}_{>0}\}\), for given \(t\), \(N(t) \sim \text{binom}(t,p)\).

We focus on \(\{X_n\}\) because it is an IID \(\text{geom}(p)\) sequence and a lot easier to define the joint distribution than \(\{S_n\}\).

Remember \(S_n = \sum_{i=1}^n X_i \sim \text{nbinom}(n,p)\).

The figure copied from above serves as a good mental picture and provides useful intuition.

Definition & properties

The Poisson process is a continuous-time arrival process \(\{S_n\}\) where each of the interarrival time follows an IID exponential distribution, i.e., \(X_i\sim\text{exp}(\lambda)\) for all \(i\in\mathbb{Z}_{>0}\).

Perhaps, you may think that, in the discrete-time setting, we check whether an arrival happens exactly every minute \((t\in\mathbb{Z}_{>0})\). In the continuous-time setting, instead, we can monitor arrivals all the time \((t\in\mathbb{R}_{>0})\).

In the discrete-time setting, it is straightforward to define an arrival process using \(\text{Bernoulli}(p)\) because an arrival occurs with probability \(p>0\). In contrast, in the continuous-time setting, the probability of arrival at any particular moment is \(0\). As a result, we cannot define an arrival process in the same straightforward way. It is more convenient to define a Poisson process indirectly in terms of the sequence of IID interarrival times \(X_1,X_2,\dots\).

Property 1

In the Poisson counting process \(\{N(t) : t>0\}\), the number of arrivals over \((0,t]\) for some \(t\) follows \(\text{pois}(\lambda t)\) with the PMF of \[P(N(t)=n) = \frac{e^{-\lambda t}(\lambda t)^n}{n!} .\]

Proof. A key observation is that \(N(t)=n\) if and only if \(S_n \leq t < S_{n+1}\) because the \(n\)th arrival occurs before time \(t\) but the \((n+1)\)th occurs after \(t\). Also remember that, by definition, \(X_{n+1}\) is independent of \(X_1,\dots,X_n\) and therefore independent of \(S_n\).

\[\begin{align} P(N(t)=n) &= P(S_n \leq t < S_{n+1})\\ &= P(S_n \leq t, S_{n+1} > t)\\ &= P(S_n \leq t, X_{n+1} > t-S_n)\\ &= \int_0^t\int_{t-s}^\infty f(s,x)dxds\\ &= \int_0^t\int_{t-s}^\infty f_{S_n}(s)f_X(x)dxds \quad (\because X_{n+1} \perp\!\!\!\perp S_n)\\ &= \int_0^t f_{S_n}(s)\left[\int_{t-s}^\infty f_X(x)dx\right]ds\\ &= \int_0^t \lambda e^{-\lambda s}\frac{(\lambda s)^{n-1}}{(n-1)!}\cdot e^{-\lambda(t-s)}ds\\ &= \frac{(\lambda)^{n}}{(n-1)!}e^{-\lambda t} \int_0^t s^{n-1}ds\\ &= \frac{e^{-\lambda t}(\lambda t)^{n}}{n!} \end{align}\]

Recall \(S_n=X_1+\dots+X_n\) follows \(\text{gamma}(n,\lambda )\).

Property 2

For \(s\geq0\), \(N(s+t)-N(t)\) follows \(\text{pois}(\lambda s)\).

Proof. Suppose that, by time \(t\), there have been \(n\) arrivals at time \(s_1, \dots, s_n\). We know \(X_{n+1}>t-s_n\). In addition, by the memoryless property of \(\text{exp}(\lambda)\), \[P(X_{n+1}>t-s_n+s | X_{n+1}>t-s_n) = P(X_{n+1}>s) = e^{-\lambda s}.\] This implies that the distribution of the first arrival after \(t\) also follows \(\text{exp}(\lambda)\) and independent of \(S_1,\dots,S_n\). It is also the case that \(X_{n+2}, X_{n+3},\dots\) are independent of \(S_1,\dots,S_n\) and \(X_{n+1}\). Consequently, a sequence of interarrival times after \(t\) is an IID \(\text{exp}(\lambda)\) and forms a Poisson process.

Property 3

\(N(t)\) has independent increments; i.e., for \(t_0<t_1<\dots<t_n\), \[N(t_n)-N(t_{n-1}), N(t_{n-1})-N(t_{n-2}), \dots, N(t_1)-N(t_0)\] are independent.

Proof. Property 2 implies that \(N(t_n)-N(t_{n-1})\) is independent of \(N(s)\) for \(s\leq t_{n-1}\) and therefore independent of \(N(t_{n-1})-N(t_{n-2}), \dots, N(t_1)-N(t_0).\)

As is often the case in mathematics, we have a choice of definitions from which the other properties are derived. So, we can use the above peoperties as a definition of Poisson process and derive the IID exponential interarrival times.



If you run a tech company to provide online services (e.g., shopping and banking), you need computer servers to respond to customers’ requests. These days, it has become quite popular to rent, rather than buy, computing infrastructure in the cloud. Just like your mobile phone contracts, you want to rent a right amount of the infrastructure. To figure out the right amount, you need models of customers’ requests.

Let’s take one step at a time.

  1. Model how many customers arrive each day.
  2. Model how many requests each customer makes.

Combining them, we have a random variable for the number of request per day. Most likely, we are interested in its mean and variance.

  • Let \(N\) be a random variable for the number of customers per day.
  • Let \(Y_1,Y_2,\dots\) are IID random variables such that \(Y_n\) is the number of requests the \(n\)th customer makes.
  • Let \(S = Y_1+\dots+Y_N\) and \(S=0\) when \(N=0\). This is of our interest.
  • Assume \(N\) and each \(Y_n\) are independent.

Then, implications are:

  1. \(\mathbb{E}S = \mathbb{E}N \cdot \mathbb{E}Y_n\) provided \(\mathbb{E}N, \mathbb{E}Y_n < \infty\).
  2. \(\operatorname{Var} S = \mathbb{E}N \cdot \operatorname{Var} Y_n + \operatorname{Var} N \cdot (\mathbb{E}Y_n)^2\) provided \(\mathbb{E}N^2, \mathbb{E}Y_n^2 < \infty\).
  3. If the customer arrival process follows the Poisson process with rate \(\lambda\), then \(\operatorname{Var} S = \lambda\mathbb{E}(Y_n)^2\).

Proof for (i). \[\begin{align} \mathbb{E}S &= \sum_{n=0}^\infty \mathbb{E}[S|N=n]\cdot P(N=n)\\ &= \sum_{n=0}^\infty \mathbb{E}[Y_1+\dots+Y_n|N=n]\cdot P(N=n)\\ &= \sum_{n=0}^\infty n\mathbb{E}Y_n \cdot P(N=n)\\ &= \mathbb{E}N \cdot \mathbb{E}Y_n \end{align}\]

Proof for (ii). First, for given \(N=n\), \[\begin{align} \mathbb{E}[S^2|N=n] &= \operatorname{Var}[S|N=n] + (\mathbb{E}[S|N=n])^2\\ &= n\operatorname{Var}Y_n + n^2(\mathbb{E}Y_n)^2 \end{align}\]

Then, \[\begin{align} \mathbb{E}S^2 &= \sum_{n=0}^\infty \mathbb{E}[S^2|N=n] \cdot P(N=n)\\ &= \sum_{n=0}^\infty \left(n\operatorname{Var}Y_n + n^2(\mathbb{E}Y_n)^2\right) \cdot P(N=n)\\ &= \mathbb{E}N\cdot\operatorname{Var}Y_n + \mathbb{E}N^2\cdot(\mathbb{E}Y_n)^2 \end{align}\]

As a result, \[\begin{align} \operatorname{Var}S &= \mathbb{E}S^2 - (\mathbb{E}S)^2\\ &= \mathbb{E}N\cdot\operatorname{Var}Y_n + \mathbb{E}N^2\cdot(\mathbb{E}Y_n)^2 - (\mathbb{E}N \cdot \mathbb{E}Y_n)^2\\ &= \mathbb{E}N\cdot\operatorname{Var}Y_n + \operatorname{Var}N \cdot (\mathbb{E}Y_n)^2 \end{align}\]

Proof for (iii). We are interested in the Poisson counting process for \(t=1\). Thus, we have \(N \sim \text{pois}(\lambda)\), which implies \(\mathbb{E}N = \lambda\) and \(\operatorname{Var}N = \lambda\). Now, the result follows from (ii).


Suppose you use a Poisson process with rate \(\lambda\) per hour to model an arrival process of customers during the happy hour. You also know roughly \(100p\)% of customers are women. Now, you wonder what arrival processes will emerge from the following simulations:

  1. Simulate the overall customer arrival process using the original Poisson process,
  2. Randomly assign to each arrival female with probability \(p\) and male with probability \(1-p\).

Interestingly, both female and male arrival processes are again Poisson processes with rate \(p\lambda\) and \((1-p)\lambda\) respectively, and they are independent. Notice, if the splitting were conditioned on \(N(t)=n\), it would be simply \(\text{binom}(n,p)\) so definitely dependent. Doesn’t this come as a surprise?

Let’s examine both counting processes \(\{N_f(t)\}\) and \(\{N_m(t)\}\). Here, we use three properties to show they are Poisson processes.

Proof. First, the independent increments property implies that pairs of increments \[\left(N_f(t_i)-N_f(t_{i-1}),\; N_m(t_i)-N_m(t_{i-1})\right),\quad 1\leq i\leq n\] are independent of each other. Second, by definition, \(N_f(0) = N_m(0) = 0\). Hence, we only need to show that \(X_f = N_f(s+t)-N_f(t)\) and \(X_m = N_m(s+t)-N_m(t)\) both follow correct Poisson distributions and are independent of each other.

Let \(Y = N(s+t)-N(t)\) and \(n\) be the number of arrivals between epoch \(t\) and \(s+t\), i.e. \(Y = n\). Also let \(k\) be the number of female customers over the same period, i.e. \(X_f = k\) and \(X_m = n-k\). Then, \[\begin{align} P(X_f=k, X_m=n-k) &= P(X_f=k, X_m=n-k, Y = n)\\ &= P(X_f=k, X_m=n-k | Y = n)P(Y = n)\\ &= \binom{n}{k}p^k(1-p)^{n-k}\cdot \frac{e^{-\lambda s}(\lambda s)^n}{n!}\\ &= \frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}\cdot \frac{e^{-\lambda s}(\lambda s)^n}{n!}\\ &= \frac{e^{-p\lambda s}(p\lambda s)^k}{k!} \frac{e^{-(1-p)\lambda s} ((1-p)\lambda s)^{(n-k)}}{(n-k)!} , \end{align}\] which implies independent \(X_f \sim \text{pois}(p\lambda)\) and \(X_m \sim \text{pois}((1-p)\lambda)\).