One distribution that Basic Intro Statistics/Business Statistics classes often miss is the Poisson distribution. This is unfortunate, because it comes up quite frequently in a number of applications. It looks a little bit intimidating (probably because of the lambda and the factorial) but it’s really not very complex.

First of all, Poisson random variables are **discrete**, so the Poisson distribution is **discrete**, i.e. a Poisson RV can only take on non-negative integer values (0, 1, 2, and so on).

Let’s assume that some **independent** event occurs, on average, Lambda (I’m going to denote lambda as L instead of the appropriate symbol because it’s easier and I don’t know if the symbol will translate to WordPress) times over a time interval.

- Independence just means that one occurrence of said event doesn’t affect other occurrences
- If A and B are independent events, then the probability of A *given B has already occurred*, Pr(A|B), is equal to the probability that A occurs:
    - Pr(A|B) = Pr(A); the occurrence of event B doesn’t affect event A
    - Similarly, Pr(A and B) = Pr(A)Pr(B)
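As a quick sanity check of the two identities above, here is a minimal sketch that uses two fair dice. The dice, the events A and B, and the helper `prob` are all made up for illustration; they aren't part of the Poisson setup itself.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice (illustrative assumption).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate on an outcome."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def a(o):
    return o[0] % 2 == 0  # event A: first die is even

def b(o):
    return o[1] == 6      # event B: second die shows 6

p_a = prob(a)
p_b = prob(b)
p_a_and_b = prob(lambda o: a(o) and b(o))
p_a_given_b = p_a_and_b / p_b  # definition of conditional probability

print(p_a, p_a_given_b)        # both are the same fraction
print(p_a_and_b == p_a * p_b)  # True for independent events
```

Exact fractions are used instead of floats so the equalities hold exactly rather than up to rounding.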

Formally, a random variable X is a **Poisson** random variable (with the stipulation that L > 0) if its probability mass function (pmf) has the form:

$$f(x; L) = \frac{L^x e^{-L}}{x!}, \quad x = 0, 1, 2, \ldots$$
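If it helps to see the pmf as code, here is a minimal sketch using only Python's standard library. The function name `poisson_pmf` and the parameter name `lam` (standing in for L) are my own choices, and the value L = 2 is just an example.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Pr(X = x) for a Poisson random variable with mean lam (the post's L)."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if x < 0:
        return 0.0  # a Poisson RV never takes negative values
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Example with an assumed L = 2
p_zero = poisson_pmf(0, 2.0)                         # equals exp(-2)
total = sum(poisson_pmf(x, 2.0) for x in range(50))  # truncated sum over x

print(p_zero)
print(total)  # very close to 1, as a pmf should be
```

The sum is cut off at x = 49 because the remaining terms are vanishingly small for L = 2; it's a check, not a proof.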

L indicates the average number of successes per time interval. Successes are just specified outcomes that may or may not be considered successful in everyday life – for example, we might measure instances of cancer, in which case we could call getting cancer a success (for study purposes) though getting cancer is, objectively, not a success. Basically, what we’re going to be able to model with this random variable is the number of occurrences of some event or phenomenon in a predetermined time interval. Sometimes stats books will call the interval a “unit of space” or something fancy, because the same idea works for counts over an area or a length instead of a stretch of time – don’t get confused. Examples of Poisson random variables:

- The number of patrons who enter a store every 5 minutes
- The number of patrons who enter a store every ___ minutes/hours/days
- The number of phone calls received by a call center every 10 minutes
- The number of phone calls received by a call center every ____ minutes/hours/days

Hopefully I’ve demonstrated that the time interval or unit of space is generic and can be adapted to suit the needs of the researcher.

Since the distribution is discrete, we’re going to use the pmf to find the probability of *exactly* x occurrences of an independent event that occurs, on average, L times over some interval:

$$\Pr(X = x) = f(x; L) = \frac{L^x e^{-L}}{x!}$$

We couldn’t find the probability of exactly x occurrences if the random variable were continuous – the probability that a continuous random variable takes on any specific value is 0. For more information, see **The Continuum Paradox**. Essentially, if a variable is continuous we could measure it in smaller and smaller units until two approximately equal observations turn out to differ. For example, suppose two people in a dataset are both recorded as 68 inches tall. Height is a continuous variable – if we measured down to the centimeter, we might see that the two observations aren’t exactly equal. If they still match, we can measure down to the millimeter. We can continue, down to fractions of nanometers, and the two observations will surely differ in height to some degree. Therefore, we can only find the probability of a continuous random variable taking on a **range** of values (via integration). With a discrete random variable, however, this is not the case – the discrete pmf below clearly illustrates this:

As you can see, each discrete value on the x-axis is associated with a probability on the y-axis, and values in between the integers on the x-axis are not associated with a probability at all, as shown by the absence of any point there. (For example, there’s no point above x=10.5, because 10.5 is not a value that the random variable can take on.) If this were a continuous pdf, the dots would be connected by a smooth (continuous) line. That’s why we can integrate a continuous pdf from 1 to 5 to find the probability of the random variable taking on a value between 1 and 5, but we have to *sum* the individual probabilities associated with x=1, 2, 3, 4, 5 to find the probability of a discrete random variable taking on a value between 1 and 5 (inclusive). (And thanks to JohnDCook for the picture).
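To make the "sum, don't integrate" point concrete, here is a small sketch that adds up the Poisson probabilities for x = 1 through 5. The rate L = 3 is an arbitrary value picked for illustration, and `poisson_pmf` and `prob_between` are hypothetical helper names.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Pr(X = x) for a Poisson random variable with mean lam (the post's L)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

def prob_between(lo: int, hi: int, lam: float) -> float:
    """Pr(lo <= X <= hi), both ends inclusive, found by summing the pmf."""
    return sum(poisson_pmf(x, lam) for x in range(lo, hi + 1))

lam = 3.0  # assumed rate, chosen only for this example
p_1_to_5 = prob_between(1, 5, lam)

print(round(p_1_to_5, 4))
```

Note the `hi + 1` in `range`: Python ranges exclude the upper end, and here the upper end has to be included, which matters for a discrete variable in a way it wouldn't for a continuous one.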

Suppose a store gets, on average, 2 customer complaints every 3 minutes. This store clearly sucks and it’s probably unrealistic, but whatever. How could we find the probability that **4 or fewer customers complain during a 9-minute interval?**

- Given our data (2 complaints per 3 minutes) we’d expect 2×3 = 6 complaints per 9 minutes, on average, since 9 minutes is three 3-minute intervals
- The probability that 4 or fewer customers complain is:
- Pr(0 customers) + Pr(1 customer) + Pr(2 customers) + Pr(3 customers) + Pr(4 customers)
- Pr(0;6) + Pr(1;6) + Pr(2;6) + Pr(3;6) + Pr(4;6)
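- Using the formula for f(x;L) with x = 0, 1, 2, 3, and 4 and L = 6:

$$\Pr(X \le 4) = \sum_{x=0}^{4} \frac{6^x e^{-6}}{x!} = e^{-6}\left(1 + 6 + 18 + 36 + 54\right) = 115\,e^{-6} \approx 0.285$$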

- Recall that 0! = 1 by definition, so you’re not dividing by 0 in the 1st term

So, the chance that 4 or fewer customers complain during a 9-minute interval, given an average of 2 customer complaints every 3 minutes, is 28.5%. Intuitively, this is reasonable – given the information the result should be relatively unlikely but not extremely rare, so 28.5% fits that description.
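The complaint example can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are mine, and the numbers are the ones from the example (2 complaints per 3 minutes, a 9-minute window, at most 4 complaints).

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Pr(X = x) for a Poisson random variable with mean lam (the post's L)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Rescale the rate to the window we care about.
complaints_per_block = 2   # average complaints per 3-minute block
block_minutes = 3
window_minutes = 9
lam = complaints_per_block * window_minutes / block_minutes  # 6 complaints per 9 minutes

# Pr(X <= 4) = Pr(0) + Pr(1) + Pr(2) + Pr(3) + Pr(4)
terms = [poisson_pmf(x, lam) for x in range(5)]
p_at_most_4 = sum(terms)

print(round(p_at_most_4, 3))  # 0.285
```

If SciPy happens to be installed, `scipy.stats.poisson.cdf(4, 6)` should give the same value, which is a handy cross-check on the hand calculation.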
