Lesson 12: The Poisson Distribution

Introduction

In this lesson, we learn about another specially named discrete probability distribution, namely the Poisson distribution.

Objectives

Poisson Distributions

Situation

Let the discrete random variable X denote the number of times an event occurs in an interval of time (or space). Then X may be a Poisson random variable with x = 0, 1, 2, ...

Examples

  1. Let X equal the number of typos on a printed page. (This is an example of an interval of space — the space being the printed page.)
  2. Let X equal the number of cars passing through the intersection of Allen Street and College Avenue in one minute. (This is an example of an interval of time  — the time being one minute.)
  3. Let X equal the number of Alaskan salmon caught in a squid driftnet. (This is again an example of an interval of space — the space being the squid driftnet.)
  4. Let X equal the number of customers at an ATM in 10-minute intervals.
  5. Let X equal the number of students arriving during office hours.

Definition. If X is a Poisson random variable, then the probability mass function is:

\(f(x)=\dfrac{e^{-\lambda} \lambda^x}{x!}\)

for x = 0, 1, 2, ... and λ > 0, where λ will be shown later to be both the mean and the variance of X.

Recall that the mathematical constant e is the unique real number such that the value of the derivative (slope of the tangent line) of the function \(f(x)=e^x\) at the point x = 0 is equal to 1. It turns out that the constant is irrational, but to five decimal places, it equals:

e = 2.71828

Also, note that there are (theoretically) an infinite number of possible Poisson distributions. Any specific Poisson distribution depends on the parameter λ.
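To make the dependence on λ concrete, here is a minimal Python sketch of the p.m.f. using only the standard library. The function name `poisson_pmf` and the λ values 0.5, 3 and 9 are illustrative assumptions, not part of the lesson.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Poisson p.m.f.: f(x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ..."""
    if x < 0 or lam <= 0:
        raise ValueError("need x >= 0 and lam > 0")
    return math.exp(-lam) * lam**x / math.factorial(x)

# Each value of lam (hypothetical examples) gives a different distribution.
for lam in (0.5, 3.0, 9.0):
    probs = [round(poisson_pmf(x, lam), 4) for x in range(6)]
    print(f"lambda = {lam}: f(0), ..., f(5) = {probs}")
```

Changing only `lam` changes every probability in the list, which is the sense in which λ indexes the whole family.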

"Derivation" of the p.m.f.

Let X denote the number of events in a given continuous interval. Then X follows an approximate Poisson process with parameter λ > 0 if:

(1) The numbers of events occurring in non-overlapping intervals are independent.

(2) The probability of exactly one event in a short interval of length h = 1/n is approximately λh = λ(1/n) = λ/n.

(3) The probability of two or more events in a short interval is essentially zero.

With these conditions in place, here's how the derivation of the p.m.f. of the Poisson distribution goes:
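A sketch of the first step, assuming the interval has unit length and is divided into n subintervals of length 1/n: by condition (3) each subinterval holds at most one event, by condition (2) it holds one event with probability of about λ/n, and by condition (1) the subintervals behave like independent trials. Under those assumptions, X is approximately binomial with n trials and success probability p = λ/n:

\(P(X=x)\approx\dfrac{n!}{x!(n-x)!}\left(\dfrac{\lambda}{n}\right)^x\left(1-\dfrac{\lambda}{n}\right)^{n-x}\)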

Now, let's make the intervals even smaller. That is, take the limit as n approaches infinity (n → ∞) for fixed x. Doing so, we get:
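One way to carry out the limit, assuming the binomial approximation with n trials and success probability λ/n that conditions (1) to (3) suggest, is to regroup the factors:

\(\lim\limits_{n\to\infty}\dfrac{n!}{x!(n-x)!}\left(\dfrac{\lambda}{n}\right)^x\left(1-\dfrac{\lambda}{n}\right)^{n-x}=\dfrac{\lambda^x}{x!}\lim\limits_{n\to\infty}\left[\dfrac{n(n-1)\cdots(n-x+1)}{n^x}\right]\left(1-\dfrac{\lambda}{n}\right)^{n}\left(1-\dfrac{\lambda}{n}\right)^{-x}\)

For fixed x, the bracketed ratio tends to 1, the factor \(\left(1-\lambda/n\right)^{n}\) tends to \(e^{-\lambda}\), and the factor \(\left(1-\lambda/n\right)^{-x}\) tends to 1. Therefore:

\(f(x)=\dfrac{e^{-\lambda}\lambda^x}{x!}\)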

Finding Poisson Probabilities

Example

Let X equal the number of typos on a printed page with a mean of 3 typos per page. What is the probability that a randomly selected page has at least one typo on it?

Solution. We can find the requested probability directly from the p.m.f. The probability that X is at least one is:

P(X ≥ 1) = 1 − P(X = 0)

Therefore, using the p.m.f. to find P(X = 0), we get:

\(P(X \geq 1)=1-\dfrac{e^{-3}3^0}{0!}=1-e^{-3}=1-0.0498=0.9502\)

That is, there is just over a 95% chance of finding at least one typo on a randomly selected page when the average number of typos per page is 3.

What is the probability that a randomly selected page has at most one typo on it?

Solution. The probability that X is at most one is:

P(X ≤ 1) = P(X = 0) + P(X = 1)

Therefore, using the p.m.f., we get:

\(P(X \leq 1)=\dfrac{e^{-3}3^0}{0!}+\dfrac{e^{-3}3^1}{1!}=e^{-3}+3e^{-3}=4e^{-3}=4(0.0498)=0.1992\)

That is, there is just under a 20% chance of finding at most one typo on a randomly selected page when the average number of typos per page is 3.
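The two calculations above can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative assumptions.

```python
import math

lam = 3.0  # mean number of typos per page, from the example

p0 = math.exp(-lam) * lam**0 / math.factorial(0)  # P(X = 0)
p1 = math.exp(-lam) * lam**1 / math.factorial(1)  # P(X = 1)

at_least_one = 1.0 - p0  # P(X >= 1), by the complement rule
at_most_one = p0 + p1    # P(X <= 1)

print(f"P(X >= 1) = {at_least_one:.4f}")
print(f"P(X <= 1) = {at_most_one:.4f}")
```

The second value prints as 0.1991 rather than 0.1992, because the hand calculation rounds \(e^{-3}\) to 0.0498 before multiplying by 4.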

Just as we used a cumulative probability table when looking for binomial probabilities, we could alternatively use a cumulative Poisson probability table, such as Table III in the back of your textbook. If you take a look at the table, you'll see that it is three pages long. Let's just take a look at the top of the first page of the table in order to get a feel for how the table works:

In summary, to use the table in the back of your textbook, as well as that found in the back of most probability textbooks, to find cumulative Poisson probabilities, do the following:

  1. Find the column headed by the relevant λ. Note that there are three rows containing λ on the first page of the table, two rows containing λ on the second page of the table, and one row containing λ on the last page of the table.
  2. Find the x in the first column on the left for which you want to find F(x) = P(X ≤ x).
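The same lookup can be reproduced numerically. The sketch below prints a small cumulative table with one column per λ and one row per x. The λ values and the range of x are arbitrary choices for illustration, not the layout of Table III, and `poisson_cdf` is a hypothetical helper name.

```python
import math

def poisson_cdf(x: int, lam: float) -> float:
    """F(x) = P(X <= x), found by summing the p.m.f. from 0 through x."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(x + 1))

lams = [1.0, 2.0, 3.0, 4.0, 5.0]  # column headings (arbitrary choice)
print("x   " + "  ".join(f"{lam:5.1f}" for lam in lams))
for x in range(7):  # row labels x = 0, ..., 6
    row = "  ".join(f"{poisson_cdf(x, lam):5.3f}" for lam in lams)
    print(f"{x:<3d} {row}")
```

Reading a value means picking the column for the relevant λ and the row for the desired x, exactly as with the printed table.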

Let's try it out on an example. If X equals the number of typos on a printed page with a mean of 3 typos per page, what is the probability that a randomly selected page has four typos on it?

Solution. The probability that a randomly selected page has four typos on it can be written as P(X = 4). We can calculate P(X = 4) by subtracting P(X ≤ 3) from P(X ≤ 4). To find P(X ≤ 3) and P(X ≤ 4) using the Poisson table, we:  

  1. Find the column headed by λ = 3
  2. Find the 3 in the first column on the left, since we want to find F(3) = P(X ≤ 3). And, find the 4 in the first column on the left, since we want to find F(4) = P(X ≤ 4).

Now, all we need to do is (1) read the probability value where the λ = 3 column and the x = 3 row intersect, and (2) read the probability value where the λ = 3 column and the x = 4 row intersect. What do you get?

The cumulative Poisson probability table tells us that P(X ≤ 4) = 0.815 and P(X ≤ 3) = 0.647. Therefore:

P(X = 4) = P(X ≤ 4) − P(X ≤ 3) = 0.815 − 0.647 = 0.168

That is, there is about a 17% chance that a randomly selected page would have four typos on it. Since it wouldn't take a lot of work in this case, you might want to verify that you'd get the same answer using the Poisson p.m.f.
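As a quick check of that claim, the sketch below computes P(X = 4) both ways: as a difference of cumulative probabilities and directly from the p.m.f. The helper names `poisson_pmf` and `poisson_cdf` are assumptions made for this illustration.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """f(x) = e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)

def poisson_cdf(x: int, lam: float) -> float:
    """F(x) = P(X <= x), the running sum of the p.m.f."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 3.0  # mean number of typos per page

via_cdf = poisson_cdf(4, lam) - poisson_cdf(3, lam)  # table-style calculation
direct = poisson_pmf(4, lam)                         # p.m.f. calculation

print(f"F(4) - F(3) = {via_cdf:.3f}")
print(f"f(4)        = {direct:.3f}")
```

Both lines print 0.168, matching the value read from the table.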

What is the probability that three randomly selected pages have more than eight typos on them?

Solution. Solving this problem involves taking one additional step. Recall that X denotes the number of typos on one printed page. Then, let's define a new random variable Y that equals the number of typos on three printed pages. If the mean of X is 3 typos per page, then the mean of Y is:

\(\lambda_Y\) = 3 typos per page × 3 pages = 9 typos per three pages

Finding the desired probability then involves finding:

P(Y > 8) = 1 − P(Y ≤ 8)

where P(Y ≤ 8) is found by looking on the Poisson table under the column headed by λ = 9.0 and the row headed by x = 8. What do you get?

The cumulative Poisson probability table tells us that P(Y ≤ 8) = 0.456. Therefore:

P(Y > 8) = 1 − P(Y ≤ 8) = 1 − 0.456 = 0.544

That is, there is a 54.4% chance that three randomly selected pages would have more than eight typos on them.
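The rescaling step can be sketched in Python as follows, assuming, as the solution does, that the count over three pages is Poisson with mean 3 × 3 = 9. The variable names are illustrative.

```python
import math

lam_per_page = 3.0            # mean typos per page
pages = 3
lam_y = lam_per_page * pages  # mean typos per three pages, i.e. 9

# P(Y <= 8): sum the Poisson p.m.f. with mean lam_y over y = 0, ..., 8
p_at_most_8 = sum(
    math.exp(-lam_y) * lam_y**y / math.factorial(y) for y in range(9)
)
p_more_than_8 = 1.0 - p_at_most_8  # complement rule

print(f"P(Y <= 8) = {p_at_most_8:.3f}")
print(f"P(Y > 8)  = {p_more_than_8:.3f}")
```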

Poisson Properties

Just as we did for the other named discrete random variables we've studied, on this page, we present and verify four properties of a Poisson random variable.

Theorem. The probability mass function:

\(f(x)=\dfrac{e^{-\lambda} \lambda^x}{x!}\)

for a Poisson random variable X is a valid p.m.f.

 Proof.
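A sketch of one standard argument, using the Maclaurin series \(e^{\lambda}=\sum_{x=0}^{\infty}\lambda^x/x!\): each f(x) is positive because λ > 0, and the probabilities sum to one:

\(\sum\limits_{x=0}^{\infty}\dfrac{e^{-\lambda}\lambda^x}{x!}=e^{-\lambda}\sum\limits_{x=0}^{\infty}\dfrac{\lambda^x}{x!}=e^{-\lambda}e^{\lambda}=1\)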

Theorem. The moment generating function of a Poisson random variable X is:

\(M(t)=e^{\lambda(e^t-1)}\text{ for }-\infty<t<\infty\)

 Proof.
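A sketch of one standard argument, starting from the definition \(M(t)=E(e^{tX})\) and again using the Maclaurin series of the exponential function, this time evaluated at \(\lambda e^t\):

\(M(t)=\sum\limits_{x=0}^{\infty}e^{tx}\dfrac{e^{-\lambda}\lambda^x}{x!}=e^{-\lambda}\sum\limits_{x=0}^{\infty}\dfrac{(\lambda e^t)^x}{x!}=e^{-\lambda}e^{\lambda e^t}=e^{\lambda(e^t-1)}\)

The series converges for every real t, which gives the stated range of t.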

Theorem. The mean of a Poisson random variable X is λ.

 Proof.
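One possible route, assuming the moment generating function \(M(t)=e^{\lambda(e^t-1)}\) stated in the previous theorem, is to differentiate once and evaluate at t = 0:

\(M'(t)=\lambda e^t e^{\lambda(e^t-1)}\)

\(\mu=E(X)=M'(0)=\lambda e^0 e^{\lambda(e^0-1)}=\lambda\)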

Theorem. The variance of a Poisson random variable X is λ.

 Proof.
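One possible route, assuming the moment generating function \(M(t)=e^{\lambda(e^t-1)}\) and the mean \(E(X)=\lambda\) from the previous theorems, is to differentiate twice and use \(\sigma^2=M''(0)-[M'(0)]^2\):

\(M''(t)=\lambda e^t e^{\lambda(e^t-1)}+(\lambda e^t)^2 e^{\lambda(e^t-1)}\)

\(M''(0)=\lambda+\lambda^2\)

\(\sigma^2=M''(0)-[M'(0)]^2=\lambda+\lambda^2-\lambda^2=\lambda\)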

Approximating the Binomial Distribution

Example

Five percent (5%) of Christmas tree light bulbs manufactured by a company are defective. The company's Quality Control Manager is quite concerned and therefore randomly samples 100 bulbs coming off of the assembly line. Let X denote the number in the sample that are defective. What is the probability that the sample contains at most three defective bulbs?

Solution. Can you convince yourself that X is a binomial random variable? Let's see: there are two possible outcomes (defective or not), the 100 trials of selecting bulbs from the assembly line can be assumed to be performed in an identical and independent manner, and the probability of getting a defective bulb can be assumed to be constant from trial to trial. So, X is indeed a binomial random variable. Calculating the probability should be easy enough, then: we just need to use the cumulative binomial table with n = 100 and p = 0.05. Oops! The table won't help us here, will it? Even many standard calculators would have trouble calculating the probability using the p.m.f.:

\(P(X\leq 3)=\dbinom{100}{0}(0.05)^0 (0.95)^{100}+\cdots+\dbinom{100}{3}(0.05)^3 (0.95)^{97}\)

Using a statistical software package (Minitab), I was able to use the binomial p.m.f. to determine that:

P(X ≤ 3) = 0.0059205 + 0.0311607 + 0.0811818 + 0.1395757 = 0.25784....
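The same four terms can be reproduced without Minitab. The following is a minimal sketch in Python (3.8 or later, for `math.comb`); the helper name `binomial_pmf` is an assumption made for illustration.

```python
import math

n, p = 100, 0.05  # sample size and defective rate from the example

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Binomial p.m.f.: C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

terms = [binomial_pmf(x, n, p) for x in range(4)]  # P(X = 0), ..., P(X = 3)
total = sum(terms)                                 # P(X <= 3)

for x, term in enumerate(terms):
    print(f"P(X = {x}) = {term:.7f}")
print(f"P(X <= 3) = {total:.5f}")
```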

But recall the way that we derived the Poisson distribution: we started with the binomial distribution and took the limit as n approached infinity. So, it seems reasonable that the Poisson p.m.f. would serve as a good approximation to the binomial p.m.f. when n is large (and therefore p = λ/n is small). Let's calculate P(X ≤ 3) using the Poisson distribution and see how close we get. In the derivation, the probability of success was defined to be:

\(p=\dfrac{\lambda}{n}\)

Therefore, the mean λ is:

\(\lambda=np\)

So, we need to use our Poisson table to find P(X ≤ 3) when λ = 100(0.05) = 5. What do you get?

The cumulative Poisson probability table tells us that P(X ≤ 3) = 0.265. That is, if there is a 5% defective rate, then there is a 26.5% chance that a randomly selected batch of 100 bulbs will contain at most 3 defective bulbs. More importantly, since we have been talking here about using the Poisson distribution to approximate the binomial distribution, we should compare our results. When we used the binomial distribution, we found P(X ≤ 3) = 0.258, and when we used the Poisson distribution, we found P(X ≤ 3) = 0.265. Not too bad of an approximation, eh?
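The comparison can also be made side by side in a few lines of Python. This is a minimal sketch using only the standard library (Python 3.8 or later for `math.comb`); the variable names are illustrative.

```python
import math

n, p = 100, 0.05
lam = n * p  # Poisson mean matched to the binomial mean, i.e. 5

# Exact binomial P(X <= 3)
binom = sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(4))

# Poisson approximation to P(X <= 3)
poisson = sum(math.exp(-lam) * lam**x / math.factorial(x) for x in range(4))

print(f"binomial: {binom:.3f}")
print(f"Poisson:  {poisson:.3f}")
print(f"absolute difference: {abs(binom - poisson):.4f}")
```

The two probabilities print as 0.258 and 0.265, so the approximation is off by less than 0.01 here.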

It is important to keep in mind that the Poisson approximation to the binomial distribution works well only when n is large and p is small. In general, the approximation works well if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and p ≤ 0.10.
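To get a feel for this rule of thumb, the sketch below measures the largest gap between the binomial and Poisson cumulative probabilities for a few (n, p) pairs. The helper `max_cdf_gap` and the chosen pairs are assumptions made for illustration. The last pair, n = 20 with p = 0.50, deliberately falls outside the rule.

```python
import math

def max_cdf_gap(n: int, p: float) -> float:
    """Largest |F_binomial(x) - F_Poisson(x)| over x = 0, ..., n, with lam = n * p."""
    lam = n * p
    binom_cdf = poisson_cdf = gap = 0.0
    for x in range(n + 1):
        binom_cdf += math.comb(n, x) * p**x * (1 - p) ** (n - x)
        poisson_cdf += math.exp(-lam) * lam**x / math.factorial(x)
        gap = max(gap, abs(binom_cdf - poisson_cdf))
    return gap

# Hypothetical test cases: three inside the rule of thumb, one outside it.
for n, p in [(20, 0.05), (100, 0.05), (100, 0.10), (20, 0.50)]:
    print(f"n = {n:3d}, p = {p:.2f}: largest gap = {max_cdf_gap(n, p):.4f}")
```

A small gap suggests the Poisson table can stand in for the binomial calculation. A large gap, as expected for the last pair, suggests it should not.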