Many interesting events happen at points in time: the arrivals of buses at a bus stop, accidents on a highway, goals scored in a soccer (football) game. The processes that model such point-in-time events are called “point processes”. An important consideration in these processes is how long it takes to get from one event to the next. For example, given that you just missed a bus, how long will you have to wait for the next one? This time is a random variable, and the choice of that random variable specifies the point process. One choice is something that isn’t very random at all (deterministic numbers are just a special case of random ones): buses arriving on a punctual schedule, say every 10 minutes. This might sound like the simplest possible point process, but there is something even simpler. It arises when the times between events follow an exponential distribution (the resulting process is called the Poisson process). The exponential distribution has that name for good reason: it is tied to Euler’s number, *e*, and to compound interest. In this article, we’ll see the connection.

Say you deposit $1 in the bank. The interest rate is *x* per year. At the end of the year, your balance will be *(1+x)*. To get more money, you ask the bank to pay you the interest monthly instead of yearly. Since the rate is *x* per year, the interest you’ll earn in a month is *x/12*, and you immediately re-invest it. So for the second month your investment becomes *(1+x/12)*, and this in turn grows by a factor of *(1+x/12)*, meaning the amount after 2 months is *(1+x/12)²*. Repeating this for 12 months, your balance at the end of the year will be *(1+x/12)¹²*. Using the Binomial theorem, this new balance at the end of the year is:
$$\left(1+\frac{x}{12}\right)^{12} = \sum_{k=0}^{12}\binom{12}{k}\left(\frac{x}{12}\right)^{k}$$

We can see that this is more than the *(1+x)* we ended up with before. This makes sense, since we were earning interest throughout the year and the interest was re-invested, earning further interest on top. But why stop at *12* intervals? You want to compound as frequently as possible, every millisecond if the bank will allow it. Instead of 12 intervals, we generalize to *n* intervals and make *n* really large. After each interval, our balance grows by a factor of *(1+x/n)*, so at the end of the year the amount we’ll have is:
$$B(x) = \lim_{n\to\infty}\left(1+\frac{x}{n}\right)^{n} \tag{1}$$

Expanding this out with the binomial theorem,
$$B(x) = \lim_{n\to\infty}\sum_{k=0}^{n}\binom{n}{k}\left(\frac{x}{n}\right)^{k} = \lim_{n\to\infty}\sum_{k=0}^{n}\frac{n(n-1)\cdots(n-k+1)}{n^{k}}\cdot\frac{x^{k}}{k!}$$

As *n* becomes larger, the *n-1*, *n-2*, etc. are practically the same as *n*. So all those factors of *n* cancel between the numerators and denominators (since we have *n→∞*) and we’re left with:
$$B(x) = 1 + x + \frac{x^{2}}{2!} + \frac{x^{3}}{3!} + \dots$$

If we differentiate *B(x)* with respect to *x*, we get *B(x)* back. If we plug in *x=1*, we get a very special number. Can you guess? It’s readily apparent from the first two terms that this number is greater than *2*.

We’ve just re-discovered the famous Euler’s number, *e = 2.71828...*. And it turns out that *B(x) = e^x*. This wasn’t immediately obvious to me, but we can see it by going back to equation (1).
$$B(x) = \lim_{n\to\infty}\left(1+\frac{x}{n}\right)^{n}, \qquad e = \lim_{n\to\infty}\left(1+\frac{1}{n}\right)^{n}$$

We have *e* in the second equation but not in the first one. The *x/n* term inside the bracket is getting in the way. To clean it up, let’s change variables by defining:
$$m = \frac{n}{x}, \quad\text{so that}\quad \frac{x}{n} = \frac{1}{m} \quad\text{and}\quad n = mx$$

This will make equation (1):
$$B(x) = \lim_{m\to\infty}\left(1+\frac{1}{m}\right)^{mx} = \left[\lim_{m\to\infty}\left(1+\frac{1}{m}\right)^{m}\right]^{x} = e^{x}$$

Note that taking the *x* outside the limit like we did above is allowed because the function *y ↦ yˣ* is continuous.
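We can sanity-check *B(x) = e^x* numerically. Here is a quick Python sketch; the rate *x = 0.07* and the values of *n* are arbitrary illustrative choices, not anything special:

```python
import math

x = 0.07  # an arbitrary rate, chosen only for illustration

# Compounding n times: (1 + x/n)^n should approach e^x as n grows.
for n in (1, 12, 365, 1_000_000):
    print(n, (1 + x / n) ** n)

# The series 1 + x + x^2/2! + x^3/3! + ... gives the same number.
series = sum(x ** k / math.factorial(k) for k in range(20))
print("series:", series, " e^x:", math.exp(x))
```

Both the compounding limit and the series home in on the same value of *e^x*, which is the whole point of the derivation above.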

So that was compound interest and the motivation for the number *e*. How does all this relate to point processes and the exponential distribution? The exponential distribution works in continuous time and models the time until some event (like a car accident).

The best way to understand it is to think of it as the limit of a coin-tossing process.

The claim to fame of the exponential distribution is that it is memoryless. In fact, it is the only continuous distribution that is memoryless. If you’re waiting for a bus whose time until arrival is exponentially distributed, then it doesn’t matter how long you’ve already waited.

The distribution of the additional time you have to wait is exactly the same whether you just arrived or have been waiting for ten hours. This property makes the exponential distribution very easy to work with.
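A small simulation illustrates this. The rate and the elapsed time below are made-up numbers for the sketch; `random.expovariate` is the standard-library sampler for exponential waiting times:

```python
import random

random.seed(0)
lam = 0.5  # arbitrary rate; mean waiting time is 1/lam = 2
waits = [random.expovariate(lam) for _ in range(200_000)]

# Mean waiting time for someone who just arrived.
print(sum(waits) / len(waits))

# Mean *additional* wait for someone who has already waited s = 3.
s = 3.0
remaining = [t - s for t in waits if t > s]
print(sum(remaining) / len(remaining))  # about the same as above
```

Conditioning on having already waited 3 time units leaves the mean of the remaining wait essentially unchanged, at about *1/λ*.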

It’s easier to understand this property when we make things discrete. Instead of waiting in continuous time for a bus to arrive, imagine tossing a coin every minute and waiting to see a heads. The number of tails we see before the first heads is a discrete random variable, since it can take only non-negative integer values (unlike the bus arrival time, which can be any real number, like 3.4 minutes). This discrete distribution is called the Geometric distribution.

A real-world scenario where the Geometric distribution applies perfectly is a slot machine in a casino, which a gambler keeps playing until he hits a jackpot. Every spin of the machine is independent of the spins in the past, which means that the Geometric distribution is also memoryless. When people think that a machine hasn’t yielded a jackpot in a long time and so one is “due”, they aren’t accounting for the memoryless nature of the process and are falling prey to the “gambler’s fallacy”.
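The discrete version is easy to verify directly. In this sketch the jackpot probability *p* and the spin counts are made-up numbers:

```python
p = 0.1  # assumed jackpot probability per spin, for illustration

def survival(k):
    # P(no jackpot in the first k spins) = (1 - p)^k
    return (1 - p) ** k

# After 50 losing spins, the chance of 10 *more* losses is unchanged:
m, n = 50, 10
cond = survival(m + n) / survival(m)  # P(K >= m+n | K >= m)
print(cond, survival(n))              # the two numbers agree
```

The conditional probability of surviving *n* more spins, given *m* losses already, equals the unconditional probability of surviving *n* spins: the past tells you nothing.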

We can model each spin of the machine by the toss of a coin. The coin has a probability *p* of heads. We start tossing this coin. What is the probability that we haven’t seen a heads after *k* tosses? This simply means that we’ve seen *k* consecutive tails. The probability of this is:
$$P(\text{first } k \text{ tosses are all tails}) = (1-p)^{k}$$

Now, we want to move to continuous time. So, we split the continuous timeline into discrete parts, and a coin toss happens in each part. Each part has a small width, *d*, so that a unit interval of time contains *1/d* of them.

And now we denote by *T* the time at which an interesting event (the coin coming up heads) first occurs. To get the distribution of *T*, we again target its survival function: the probability that it is greater than some number, *t*. We know that a total of *⌊t/d⌋* tosses must have happened by this time (where *⌊.⌋* is the greatest integer, or floor, function). For example, if *t=10* and *d=3*, then *⌊10/3⌋ = ⌊3.33⌋ = 3* tosses would have happened by then. To make this a truly continuous-time process, we need to make *d* so small that it vanishes. But as we make *d* small, we end up with an increasing number of coin tosses. So our *p* must also become small to compensate (otherwise, the events become so frequent that any minuscule interval of time contains many of them). So *p* and *d* must go to *0* simultaneously. Using the equation above for the discrete case, the number of tosses that have happened by time *t*, and the fact that *p* and *d* must go to *0*, we get the survival function of *T*:
$$P(T > t) = \lim_{p,\,d\to 0}\,(1-p)^{\lfloor t/d\rfloor} = \lim_{p,\,d\to 0}\,(1-p)^{t/d}\,\cdot\,\lim_{p,\,d\to 0}\,(1-p)^{\lfloor t/d\rfloor - t/d} \tag{3}$$

The second limit just becomes 1, since its exponent stays between *-1* and *0* while the base *(1-p)* goes to *1*.

This limit is interesting only when *p* and *d* decrease to *0* together in a linear relationship with each other. Because both are going to zero together, the line has to have an intercept of *0*. Let’s say the line is:
$$p = \lambda d$$

This is the step most people have trouble with. Why this equation, all of a sudden? Where did it come from? If we ask what the double limit in equation (3) equals, the answer is “it depends”. It depends on the relationship between *p* and *d*. For one thing, we know that *p* and *d* approach *0* together, so the relationship between them must pass through *(0,0)*. Next, we need to pick the functional form, and it is up to us what to pick. But if we pick anything but a linear relationship, we get a trivial answer (like *0* or *1* for any *t*) and don’t get an interesting continuous distribution.
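We can watch this happen numerically. In the sketch below *λ* and *t* are arbitrary choices; with *p = λd* the survival probability settles at *e^(-λt)*, while a nonlinear choice like *p = d²* degenerates to a trivial answer:

```python
import math

lam, t = 2.0, 1.5  # arbitrary rate and time, for illustration

# Linear relationship p = lam * d: converges to exp(-lam * t).
for d in (0.1, 0.01, 0.001, 0.0001):
    p = lam * d
    print(d, (1 - p) ** math.floor(t / d))
print("exp(-lam*t):", math.exp(-lam * t))

# Nonlinear relationship p = d**2: the survival probability goes to 1.
for d in (0.1, 0.001):
    print(d, (1 - d ** 2) ** math.floor(t / d))
```

With the quadratic relationship, *p* shrinks faster than the number of tosses grows, so in the limit no event ever happens and the “distribution” is trivial.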

With the linear relationship above, equation (3) becomes:
$$P(T > t) = \lim_{d\to 0}\,(1-\lambda d)^{t/d}$$


We need another substitution to make this align with equation (1):
$$n = \frac{1}{d}, \quad\text{giving}\quad P(T > t) = \lim_{n\to\infty}\left(1+\frac{(-\lambda)}{n}\right)^{nt} = \left[\lim_{n\to\infty}\left(1+\frac{(-\lambda)}{n}\right)^{n}\right]^{t} = \left(e^{-\lambda}\right)^{t} = e^{-\lambda t}$$

This is the survival function of the exponential distribution. We have gone from the Geometric distribution, taken a limit, and arrived at the exponential distribution, all while using results derived from compound interest.