A dive into one of the lesser-known probability distributions

Perhaps you have heard of the binomial distribution, but have you heard of its cousin, the negative binomial distribution? This discrete probability distribution is applied to count-based data in industries such as insurance and manufacturing, so it is a useful concept for Data Scientists to understand. In this article, we will dive into this distribution and the problems it can solve.

To understand the negative binomial distribution, it’s important to gain intuition about the binomial distribution.

The binomial distribution gives the probability of observing a certain number of successes, x, in a fixed number of trials, n. The trials in this case are Bernoulli trials: each one is independent, has a binary outcome (success or failure), and has the same probability of success. If you are unfamiliar with the binomial distribution, check out my previous post on it here:
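As a quick refresher, here is a minimal sketch of the binomial probability in Python. The function name `binomial_pmf` and the symbol `p` (the probability of success on a single trial) are my own labels for illustration, and the coin-flip numbers are a made-up example.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent Bernoulli trials.

    p is the (assumed constant) probability of success on each trial.
    """
    if not 0 <= x <= n:
        return 0.0
    # Ways to place x successes among n trials, times the probability
    # of any one such arrangement.
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical example: exactly 3 heads in 10 flips of a fair coin.
print(binomial_pmf(3, 10, 0.5))  # 0.1171875
```

Note that n is fixed in advance here and the number of successes is the random quantity.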

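The negative binomial distribution flips this and models the number of trials, x, needed to reach a certain number of successes, r. The ‘negative’ label is often explained by the fact that it implicitly models the number of failures, x − r, that occur before the r-th success. (More formally, the name comes from the binomial coefficient with a negative upper index that appears when its probability mass function is written in terms of failures.)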

A better way of thinking about the negative binomial distribution is:

Probability of the “r-th” success happening on the “x-th” trial
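To make that statement concrete, here is a minimal sketch under the usual assumptions (independent trials, each with the same success probability). For the r-th success to land exactly on trial x, the first x − 1 trials must contain exactly r − 1 successes, and trial x itself must be a success. The function name `negative_binomial_pmf` and the symbol `p` are my own labels, and the coin-flip numbers are a made-up example.

```python
from math import comb


def negative_binomial_pmf(x: int, r: int, p: float) -> float:
    """Probability that the r-th success happens exactly on the x-th trial.

    p is the (assumed constant) probability of success on each trial.
    """
    if r < 1 or x < r:
        # You cannot collect r successes in fewer than r trials.
        return 0.0
    # r - 1 successes somewhere in the first x - 1 trials,
    # then a success on trial x; the remaining x - r trials are failures.
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


# Hypothetical example: the 3rd head arrives on the 5th flip of a fair coin.
print(negative_binomial_pmf(5, 3, 0.5))  # 0.1875
```

If you reach for a library instead, check its parameterization first. For example, `scipy.stats.nbinom` counts the number of failures before the r-th success rather than the total number of trials, so you would pass x − r instead of x.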

A special case of the negative binomial distribution is the geometric distribution, which is the case where r = 1. It models the number of trials needed to get our first success. You can read more about the geometric distribution here:
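The link between the two distributions can be checked numerically. The sketch below is only an illustration: the function names and the symbol `p` (the per-trial success probability) are my own labels, and p = 0.3 is an arbitrary made-up value. It compares the geometric probability with the negative binomial probability when the required number of successes is 1.

```python
from math import comb


def negative_binomial_pmf(x: int, r: int, p: float) -> float:
    """Probability that the r-th success happens exactly on the x-th trial."""
    if r < 1 or x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def geometric_pmf(x: int, p: float) -> float:
    """Probability that the first success happens exactly on the x-th trial."""
    if x < 1:
        return 0.0
    # x - 1 failures in a row, followed by one success.
    return p * (1 - p) ** (x - 1)


# With one required success, the two formulas should agree on every trial.
p = 0.3  # arbitrary success probability, chosen for illustration
for x in range(1, 11):
    assert abs(geometric_pmf(x, p) - negative_binomial_pmf(x, 1, p)) < 1e-12
```

The assertions pass because the combinatorial term is 1 when only one success is required, which leaves just the probability of x − 1 failures followed by a success.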